Abstract

Quantum f-divergences are a quantum generalization of the classical notion of f-divergences,
and are a special case of Petz' quasi-entropies. Many well-known distinguishability measures
of quantum states are given by, or derived from, f-divergences; special examples include the
quantum relative entropy, the Rényi relative entropies, and the Chernoff and Hoeffding
measures. Here we show that the quantum f-divergences are monotonic under substochastic maps
whenever the defining function is operator convex. This extends and unifies all previously
known monotonicity results for this class of distinguishability measures. We also analyze the
case where the monotonicity inequality holds with equality, and extend Petz' reversibility
theorem for a large class of f-divergences and other distinguishability measures. We apply our
findings to the problem of quantum error correction, and show that if a stochastic map
preserves the pairwise distinguishability on a set of states, as measured by a suitable
f-divergence, then its action can be reversed on that set by another stochastic map that can
be constructed from the original one in a canonical way. We also provide an integral
representation for operator convex functions on the positive half-line, which is the main
ingredient in extending previously known results on the monotonicity inequality and the case
of equality. We also consider some special cases where the convexity of f is sufficient for
the monotonicity, and obtain the inverse Hölder inequality for operators as an application.
The presentation is completely self-contained and requires only standard knowledge of matrix
analysis.



1 Introduction 

In the stochastic modeling of systems, the probabilities of the different outcomes of possible 
measurements performed on the system are given by a state, which is a probability distribution 
in the case of classical systems and a density operator on the Hilbert space of the system in the 
quantum case. In applications, it is important to have a measure of how different two states are 
from each other and, as it turns out, such measures arise naturally in statistical problems like
state discrimination. Probably the most important statistically motivated distance measure 
is the relative entropy, given as 

$$S(\rho\|\sigma) := \begin{cases} \mathrm{Tr}\,\rho(\log\rho - \log\sigma), & \operatorname{supp}\rho \le \operatorname{supp}\sigma, \\ +\infty, & \text{otherwise,} \end{cases}$$

for two density operators $\rho, \sigma$ on a finite-dimensional Hilbert space. Its operational
interpretation is given as the optimal exponential decay rate of an error probability in the
state discrimination problem of Stein's lemma ETJ |371 HI], and it is the mother quantity for
many other relevant notions in information theory, like the entropy, the conditional entropy,
the mutual information and the channel capacity HI].

Indisputably the most relevant mathematical property of the relative entropy is its
monotonicity under stochastic maps, i.e.,

$$S(\Phi(\rho)\|\Phi(\sigma)) \le S(\rho\|\sigma) \tag{1.1}$$

for any two states $\rho, \sigma$ and quantum stochastic map $\Phi$ jH] . Heuristically, (1.1)
means that the distinguishability of two states cannot increase under further randomization.
The monotonicity inequality yields immediately that if the action of $\Phi$ can be reversed on
the set $\{\rho, \sigma\}$, i.e., there exists another stochastic map $\Psi$ such that
$\Psi(\Phi(\rho)) = \rho$ and $\Psi(\Phi(\sigma)) = \sigma$, then $\Phi$ preserves the relative
entropy of $\rho$ and $\sigma$, i.e., inequality (1.1) holds with equality. A highly
non-trivial observation, made by Petz in P2| H3], is that the converse is also true: If $\Phi$
preserves the relative entropy of $\rho$ and $\sigma$ then it is reversible on
$\{\rho, \sigma\}$ and, moreover, the reverse map can be given in terms of $\Phi$ and $\sigma$
in a canonical way. This fact has found applications in the theory of quantum error correction
[2U [221 EE] , the characterization of quantum Markov chains [TS] and the description of states
with zero quantum discord [TUl E], among many others.

Relative entropy has various generalizations, most notably Rényi's $\alpha$-relative entropies
[16] that share similar monotonicity and convexity properties with the relative entropy and are
also related to error exponents in binary state discrimination problems 0,EI]. A general
approach to quantum relative entropies was developed by Petz in 1985 [10], who introduced the
concept of quasi-entropies (see also [H] and Chapter 7 in [3H]). Let
$\mathcal{A} := \mathcal{B}(\mathbb{C}^n)$ denote the algebra of linear operators on the
finite-dimensional Hilbert space $\mathbb{C}^n$ (which is essentially the algebra of
$n \times n$ matrices with complex entries, and hence we also use the term matrix algebra). For
a positive $A \in \mathcal{A}$ and a strictly positive $B \in \mathcal{A}$, a general
$K \in \mathcal{A}$ and a real-valued continuous function $f$ on $[0, +\infty)$, the
quasi-entropy is defined as

$$S_f^K(A\|B) := \langle KB^{1/2}, f(\Delta(A/B))(KB^{1/2}) \rangle_{HS} = \mathrm{Tr}\, B^{1/2}K^* f(\Delta(A/B))(KB^{1/2}),$$

where $\langle X, Y \rangle_{HS} := \mathrm{Tr}\, X^*Y$, $X, Y \in \mathcal{A}$, is the
Hilbert-Schmidt inner product, and $\Delta(A/B) : \mathcal{A} \to \mathcal{A}$ is the so-called
relative modular operator acting on $\mathcal{A}$ as $\Delta(A/B)X := AXB^{-1}$,
$X \in \mathcal{A}$. The relative entropy can be obtained as a special case, corresponding to
the function $f(x) := x\log x$ and $K := I$, and Rényi's $\alpha$-relative entropies are
related to the quasi-entropies corresponding to $f(x) := x^\alpha$.

The two most important properties of the quasi-entropy are its monotonicity and joint
convexity. Let $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ be a linear map between two matrix
algebras $\mathcal{A}_1$ and $\mathcal{A}_2$, and let $\Phi^* : \mathcal{A}_2 \to \mathcal{A}_1$
denote its dual with respect to the Hilbert-Schmidt inner products. A trace-preserving map
$\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ is called a stochastic map if $\Phi^*$ satisfies the
Schwarz inequality $\Phi^*(Y^*)\Phi^*(Y) \le \Phi^*(Y^*Y)$, $Y \in \mathcal{A}_2$. The following
monotonicity property of the quasi-entropies was shown in [T01 ST]: Assume that $f$ is an
operator monotone decreasing function on $[0, +\infty)$ with $f(0) \le 0$ and
$\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ is a stochastic map. Then

$$S_f^K(\Phi(A)\|\Phi(B)) \le S_f^{\Phi^*(K)}(A\|B)$$

holds for any $K \in \mathcal{A}_2$ and invertible positive operators $A, B \in \mathcal{A}_1$.
If $f$ is an operator convex function on $[0, +\infty)$, then $S_f^K(A\|B)$ is jointly convex
in the variables $A$ and $B$ [391 SOI SI], i.e.,

$$S_f^K\Big(\sum_i p_iA_i \,\Big\|\, \sum_i p_iB_i\Big) \le \sum_i p_i\, S_f^K(A_i\|B_i)$$

for any finite set of positive invertible operators $A_i, B_i \in \mathcal{A}$ and probability
weights $\{p_i\}$.

Quasi-entropy is a quantum generalization of the f-divergence of classical probability
distributions, introduced independently by Csiszár [8] and Ali and Silvey [JJ, which is a
widely used concept in classical information theory and statistics [30l [3Tj . This motivates
the terminology "quantum f-divergence", which we will use in this paper for the quasi-entropies
with $K = I$. Actually, our notion of f-divergence is also a slight generalization of the
quasi-entropy in the sense that we extend it to cases where the second operator is not
invertible. This extension is the same as in the classical setting, and was already considered
in the quantum setting, e.g., in [50]. We give the precise definition of the quantum
f-divergences in Section 2, where we also give some of their basic properties, and prove that
they are continuous in their second variable; the latter seems to be a new result. In Section 3
we collect various technical statements on positive maps, which are necessary for the
succeeding sections. In particular, we introduce a generalized notion of Schwarz maps, and
investigate the properties of this class of positive maps.

The monotonicity $S_f(\Phi(A)\|\Phi(B)) \le S_f(A\|B)$ of the f-divergences was proved in [HJ
for the case where $f$ is operator monotonic decreasing and $\Phi$ is a stochastic map, and
where $f$ is operator convex and $\Phi$ is the restriction onto a subalgebra; in both cases $B$
was assumed to be invertible. This was extended in [29] to the case where $f$ is operator
convex, $\Phi$ is stochastic and both $A$ and $B$ are invertible, using an integral
representation of operator convex functions on $(0, +\infty)$, and in [50] to the case where
$f$ is operator convex and $\Phi$ is a completely positive trace-preserving map, without
assuming the invertibility of $A$ or $B$, using the monotonicity under restriction onto a
subalgebra and Lindblad's representation of completely positive maps. In Section 4 we give a
common generalization of these results by proving the monotonicity relation for the case where
$f$ is operator convex, $\Phi$ is a substochastic map which preserves the trace of $B$, and
both $A$ and $B$ are arbitrary positive semidefinite operators. This is based on the
continuity result proved in Section 2 and an integral representation of operator convex
functions on $[0, +\infty)$ that we provide in Section [HI. To the best of our knowledge, this
representation is new, and might be interesting in itself.

It has been known [Ml |25| W2\ for the relative entropy and some Rényi relative entropies that
the monotonicity inequality for two operators and a 2-positive trace-preserving map holds with
equality if and only if the action of the map can be reversed on the given operators. We extend
this result to a large class of f-divergences in Section 5, where we show that if a stochastic
map $\Phi$ preserves the f-divergence of two operators $A$ and $B$ corresponding to a
non-linear operator convex function with no quadratic term then it preserves a certain set of
"primitive" f-divergences, corresponding to the functions $\varphi_t(x) := -x/(x + t)$ for a
set $T$ of $t$'s. Moreover, if this set has large enough cardinality (depending on $A$, $B$ and
$\Phi$) and $\Phi$ is 2-positive then there exists another stochastic map $\Psi$ reversing the
action of $\Phi$ on $\{A, B\}$, i.e., such that $\Psi(\Phi(A)) = A$ and $\Psi(\Phi(B)) = B$. In
Section 6 we formulate equivalent conditions for reversibility in terms of the preservation of
measures relevant to state discrimination, namely the Chernoff distance and the Hoeffding
distances, and we also show that these measures cannot be represented as f-divergences. In
Section 7 we apply the above results on reversibility to the problem of quantum error
correction, and give equivalent conditions for the reversibility of a quantum operation on a
set of states in terms of the preservation of pairwise f-divergences, Chernoff and Hoeffding
distances, and many-copy trace-norm distances. Related to the latter, we also analyze the
connection with the recent results of [6], where reversibility was obtained from the
preservation of single-copy trace-norm distances under some extra technical conditions, and
show that the approach of [6] is unlikely to be recovered from our analysis of the preservation
of f-divergences, as the quantum trace-norm distances cannot be represented as f-divergences.
This is in contrast with the classical case, and is another manifestation of the significantly
more complicated structure of quantum states and their distinguishability measures, as compared
to their classical counterparts.

In our analysis of the monotonicity inequality $S_f(\Phi(A)\|\Phi(B)) \le S_f(A\|B)$ and the
case of the equality, it is essential that $f$ is operator convex; it is an open question
though whether this is actually necessary. In Appendix A we consider some situations where
convexity of $f$ is sufficient; this includes the case of commuting operators, which is
essentially a reformulation of the classical case, and the monotonicity under the pinching
operation defined by the reference operator $B$, which was first proved in [2] for the Rényi
relative entropies. Although both of these cases are very special and their proofs are
considerably simpler than the general case, they are important for applications. As an
illustration, we derive from these results the exponential version of the operator Hölder
inequality and the inverse Hölder inequality, and analyse the case when they hold with
equality.

2 Quantum f-divergences: definition and basic properties

Let $\mathcal{A}$ be a finite-dimensional C*-algebra. Unless otherwise stated, we will always
assume that $\mathcal{A}$ is a C*-subalgebra of $\mathcal{B}(\mathcal{H})$ for some
finite-dimensional Hilbert space $\mathcal{H}$, i.e., $\mathcal{A}$ is a subalgebra of
$\mathcal{B}(\mathcal{H})$ that is closed under taking the adjoint of operators. For
simplicity, we also assume that the unit of $\mathcal{A}$ coincides with the identity operator
$I$ on $\mathcal{H}$; if this is not the case, we can simply consider a smaller Hilbert space.
The Hilbert-Schmidt inner product on $\mathcal{A}$ is defined as

$$\langle A, B \rangle_{HS} := \mathrm{Tr}\, A^*B, \qquad A, B \in \mathcal{A},$$

with induced norm $\|A\|_{HS} := \sqrt{\mathrm{Tr}\, A^*A}$, $A \in \mathcal{A}$.

We will follow the convention that powers of a positive semidefinite operator are only taken
on its support; in particular, if $0 \le A \in \mathcal{A}$ then $A^{-1}$ denotes the
generalized inverse of $A$ and $A^0$ is the projection onto the support of $A$. For a real
$t \in \mathbb{R}$, $A^{it}$ is a unitary on $\operatorname{supp} A$ but not on the whole
Hilbert space unless $A^0 = I$. We denote by $\log^*$ the extension of $\log$ to the domain
$[0, +\infty)$, defined to be $0$ at $0$. With these conventions, we have
$\frac{d}{dt} A^t \big|_{t=0} = \log^* A$. We also set

$$0 \cdot \pm\infty := 0, \qquad \log 0 := -\infty, \qquad \text{and} \qquad \log +\infty := +\infty.$$
For a linear operator $A \in \mathcal{A}$, let $L_A, R_A \in \mathcal{B}(\mathcal{A})$ denote
the left and the right multiplications by $A$, respectively, defined as

$$L_A : X \mapsto AX, \qquad R_A : X \mapsto XA, \qquad X \in \mathcal{A}.$$

Left and right multiplications commute with each other, i.e., $L_AR_B = R_BL_A$,
$A, B \in \mathcal{A}$. If $A, B$ are positive elements in $\mathcal{A}$ with spectral
decompositions $A = \sum_{a \in \mathrm{spec}(A)} aP_a$ and
$B = \sum_{b \in \mathrm{spec}(B)} bQ_b$ (where $\mathrm{spec}(X)$ denotes the spectrum of
$X \in \mathcal{A}$) then the spectral decomposition of $L_AR_{B^{-1}}$ is given by
$L_AR_{B^{-1}} = \sum_{a \in \mathrm{spec}(A)} \sum_{b \in \mathrm{spec}(B)} ab^{-1} L_{P_a}R_{Q_b}$,
and for any function $f$ on $\{ab^{-1} : a \in \mathrm{spec}(A),\ b \in \mathrm{spec}(B)\}$, we
have

$$f(L_AR_{B^{-1}}) = \sum_{a \in \mathrm{spec}(A)} \sum_{b \in \mathrm{spec}(B)} f(ab^{-1})\, L_{P_a}R_{Q_b}. \tag{2.1}$$

(Note that we have $0^{-1} = 0$ in the above formulas due to our convention.)

2.1 Definition. Let $A$ and $B$ be positive semidefinite operators on $\mathcal{H}$ and let
$f : [0, +\infty) \to \mathbb{R}$ be a real-valued function on $[0, +\infty)$ such that $f$ is
continuous on $(0, +\infty)$ and the limit

$$\omega(f) := \lim_{x \to +\infty} \frac{f(x)}{x}$$

exists in $[-\infty, +\infty]$. The f-divergence of $A$ with respect to $B$ is defined as

$$S_f(A\|B) := \langle B^{1/2}, f(L_AR_{B^{-1}}) B^{1/2} \rangle_{HS}$$

when $\operatorname{supp} A \le \operatorname{supp} B$. In the general case, we define

$$S_f(A\|B) := \lim_{\varepsilon \searrow 0} S_f(A\|B + \varepsilon I). \tag{2.2}$$

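2.2 Proposition. The limit in (2.2) exists, and

$$\lim_{\varepsilon \searrow 0} S_f(A\|B + \varepsilon I) = \langle B^{1/2}, f(L_AR_{B^{-1}}) B^{1/2} \rangle_{HS} + \omega(f)\, \mathrm{Tr}\, A(I - B^0).$$

In particular, Definition 2.1 is consistent in the sense that if
$\operatorname{supp} A \le \operatorname{supp} B$ then

$$\lim_{\varepsilon \searrow 0} S_f(A\|B + \varepsilon I) = \langle B^{1/2}, f(L_AR_{B^{-1}}) B^{1/2} \rangle_{HS}.$$

Proof. By (2.1) we have
$S_f(A\|B + \varepsilon I) = \sum_{a \in \mathrm{spec}(A)} \sum_{b \in \mathrm{spec}(B)} (b + \varepsilon) f(a/(b + \varepsilon))\, \mathrm{Tr}\, P_aQ_b$,
and the assertion follows by a straightforward computation using that for any $a, b \ge 0$,

$$\lim_{0 < \delta \to b} \delta f(a/\delta) = \begin{cases} b f(a/b), & b > 0, \\ a\,\omega(f), & b = 0. \end{cases} \tag{2.3}$$

□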

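2.3 Corollary. For $A$, $B$ and $f$ as in Definition 2.1,

$$\begin{aligned}
S_f(A\|B) &= \langle B^{1/2}, f(L_AR_{B^{-1}}) B^{1/2} \rangle_{HS} + \omega(f)\, \mathrm{Tr}\, A(I - B^0) && (2.4) \\
&= f(0)\, \mathrm{Tr}\, B + \langle B^{1/2}, (f - f(0))(L_AR_{B^{-1}}) B^{1/2} \rangle_{HS} + \omega(f)\, \mathrm{Tr}\, A(I - B^0) && (2.5) \\
&= \sum_{a \in \mathrm{spec}(A)} \Big( \sum_{b \in \mathrm{spec}(B) \setminus \{0\}} b f(a/b)\, \mathrm{Tr}\, P_aQ_b + a\,\omega(f)\, \mathrm{Tr}\, P_aQ_0 \Big), && (2.6)
\end{aligned}$$

and $S_f(A\|B) = \langle B^{1/2}, f(L_AR_{B^{-1}}) B^{1/2} \rangle_{HS}$ if and only if
$\operatorname{supp} A \le \operatorname{supp} B$ or $\lim_{x \to +\infty} f(x)/x = 0$.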
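As a minimal numerical sketch of formula (2.6), the following Python fragment evaluates an f-divergence from spectral data supplied by the caller. The function names `xlogx` and `f_divergence`, the argument layout, and the example operators are illustrative assumptions of ours and do not appear in the paper; the eigenvalues and the overlaps $\mathrm{Tr}\, P_aQ_b$ are assumed to be computed elsewhere.

```python
import math

def xlogx(x):
    # f(x) = x log x, extended to x = 0 by f(0) = 0
    return x * math.log(x) if x > 0 else 0.0

def f_divergence(spec_a, spec_b, overlap, f, omega):
    """Evaluate the double sum in (2.6).

    spec_a[i], spec_b[j]: distinct eigenvalues of A and B (all >= 0).
    overlap[i][j]: Tr P_a Q_b for the corresponding spectral projections.
    f: the defining function; omega: lim f(x)/x, possibly +/- math.inf.
    """
    total = 0.0
    for i, a in enumerate(spec_a):
        for j, b in enumerate(spec_b):
            w = overlap[i][j]
            if w == 0:
                continue  # vanishing overlap contributes nothing
            if b > 0:
                total += b * f(a / b) * w
            elif a > 0:
                # b = 0 term of (2.6); a = 0 is skipped, matching 0 * (+-inf) := 0
                total += a * omega * w
    return total

# Hypothetical example: A = I/2 and B = diag(1/4, 3/4) on a qubit, so A has the
# single eigenvalue 1/2 with projection I, and Tr I Q_b = 1 for both b.
example = f_divergence([0.5], [0.25, 0.75], [[1.0, 1.0]], xlogx, math.inf)
```

For this commuting example the value agrees with $\mathrm{Tr}\, A(\log A - \log B) = \tfrac12 \log\tfrac43$. Replacing $B$ by $\mathrm{diag}(1, 0)$ makes the support condition fail, and the same call returns `math.inf` because $\omega(f) = +\infty$ for $f(x) = x\log x$.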
2.4 Remark. Note that $L_AR_{B^{-1}} = \Delta(A/B)$, given in the Introduction, and hence the
f-divergence is a special case of the quasi-entropy (with $K = I$) when
$\operatorname{supp} A \le \operatorname{supp} B$ or $\lim_{x \to +\infty} f(x)/x = 0$.

2.5 Corollary. Let $A, A_1, A_2, B, B_1, B_2$ and $f$ be as in Definition 2.1. We have the
following:

(i) For every $\lambda \in [0, +\infty)$,

$$S_f(\lambda A\|\lambda B) = \lambda S_f(A\|B).$$

(ii) If $A_1^0 \vee B_1^0 \perp A_2^0 \vee B_2^0$ then

$$S_f(A_1 + A_2\|B_1 + B_2) = S_f(A_1\|B_1) + S_f(A_2\|B_2).$$

(iii) If $V : \mathcal{H} \to \mathcal{K}$ is a linear or anti-linear isometry then

$$S_f(VAV^*\|VBV^*) = S_f(A\|B).$$

(iv) If $x$ is a unit vector in some Hilbert space $\mathcal{K}$ then

$$S_f(A \otimes |x\rangle\langle x| \,\|\, B \otimes |x\rangle\langle x|) = S_f(A\|B).$$

Proof. Immediate from (2.6). □

2.6 Remark. Note that if $V$ is an anti-linear isometry then there exists a linear isometry
$\tilde V$ and a basis $\mathcal{B}$ such that $VAV^* = \tilde V A^T \tilde V^*$,
$A \in \mathcal{A}_+$, where the transposition is in the basis $\mathcal{B}$. Hence, (iii) of
Corollary 2.5 is equivalent to the f-divergences being invariant under conjugation by an
isometry and transposition in an arbitrary basis.

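2.7 Example. Let $f_\alpha(x) := x^\alpha$ for $\alpha > 0$, $x \ge 0$. For $\alpha = 0$, we
define $f_0(x) := 1$, $x > 0$, and $f_0(0) := 0$. A straightforward computation yields that

$$S_{f_\alpha}(A\|B) = \mathrm{Tr}\, A^\alpha B^{1-\alpha} + \Big( \lim_{x \to +\infty} x^{\alpha - 1} \Big) \mathrm{Tr}\, A(I - B^0) \tag{2.7}$$

for any $A, B \in \mathcal{A}_+$, and hence, if $0 \le \alpha < 1$ then

$$S_{f_\alpha}(A\|B) = \mathrm{Tr}\, A^\alpha B^{1-\alpha},$$

whereas for $\alpha > 1$ we have

$$S_{f_\alpha}(A\|B) = \begin{cases} \mathrm{Tr}\, A^\alpha B^{1-\alpha}, & \operatorname{supp} A \le \operatorname{supp} B, \\ +\infty, & \text{otherwise.} \end{cases}$$

The Rényi relative entropy of $A$ and $B$ with parameter
$\alpha \in [0, +\infty) \setminus \{1\}$ is defined as

$$S_\alpha(A\|B) := \frac{1}{\alpha - 1} \log S_{f_\alpha}(A\|B) = \begin{cases} \frac{1}{\alpha - 1} \log \mathrm{Tr}\, A^\alpha B^{1-\alpha}, & \operatorname{supp} A \le \operatorname{supp} B \text{ or } \alpha < 1, \\ +\infty, & \text{otherwise.} \end{cases}$$

The choice $f(x) := x\log x$ yields the relative entropy of $A$ and $B$,

$$S_f(A\|B) = S(A\|B) = \begin{cases} \mathrm{Tr}\, A(\log^* A - \log^* B), & \operatorname{supp} A \le \operatorname{supp} B, \\ +\infty, & \text{otherwise,} \end{cases}$$

where the second case follows from $\lim_{x \to +\infty} \frac{x\log x}{x} = +\infty$.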
The following shows that the representing function for an f-divergence is unique:

2.8 Proposition. Assume that a function
$D : \mathcal{A}_+ \times \mathcal{A}_+ \to \mathbb{R}$ can be represented as an f-divergence.
Then the representing function $f$ is uniquely determined by the restriction of $D$ onto the
trivial subalgebra as

$$f(x) = S_f(xI\|I)/\dim\mathcal{H}, \qquad x \in [0, +\infty). \tag{2.8}$$

In particular, for every $D : \mathcal{A}_+ \times \mathcal{A}_+ \to \mathbb{R}$ there is at
most one function $f$ such that $D = S_f$ holds.

Proof. Formula (2.8) is obvious from (2.6), and the rest follows immediately. □

In most of the applications, f-divergences are used to compare probability distributions in
the classical, and density operators in the quantum case, and one might wonder whether there
is more freedom in representing a measure as an f-divergence if we are only interested in
density operators instead of general positive semidefinite operators. The following simple
argument shows that if a measure can be represented as an f-divergence on quantum states then
its values are uniquely determined by its values on classical probability distributions.

Given density operators $\rho$ and $\sigma$ with spectral decompositions
$\rho = \sum_{a \in \mathrm{spec}(\rho)} aP_a$ and
$\sigma = \sum_{b \in \mathrm{spec}(\sigma)} bQ_b$, we can define classical probability density
functions $(\rho : \sigma)_1$ and $(\rho : \sigma)_2$ on
$\mathrm{spec}(\rho) \times \mathrm{spec}(\sigma)$ as

$$(\rho : \sigma)_1(a, b) := a\, \mathrm{Tr}\, P_aQ_b, \qquad (\rho : \sigma)_2(a, b) := b\, \mathrm{Tr}\, P_aQ_b.$$

This kind of mapping from pairs of quantum states to pairs of classical states was introduced
in [36], and is one of the main ingredients in the proofs of the quantum Chernoff and Hoeffding
bound theorems.

2.9 Lemma. For any two density operators $\rho, \sigma$ and any function $f$ as in
Definition 2.1,

$$S_f(\rho\|\sigma) = S_f((\rho : \sigma)_1 \,\|\, (\rho : \sigma)_2).$$

Proof. It is immediate from (2.6). □

2.10 Corollary. Let $f$ and $g$ be functions as in Definition 2.1. If $S_f$ and $S_g$ coincide
on classical probability distributions then they coincide on quantum states as well.

Proof. Obvious from Lemma 2.9. □
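The reduction to classical distributions used in Lemma 2.9 can be illustrated by a short Python sketch. The helper names `classical_pair` and `classical_f_divergence`, as well as the qubit example (a state diagonal in the computational basis and a state diagonal in the Hadamard basis, so that every overlap $\mathrm{Tr}\, P_aQ_b$ equals 1/2), are hypothetical choices made only for this illustration.

```python
import math

def classical_pair(spec_rho, spec_sigma, overlap):
    """Build the two classical distributions on spec(rho) x spec(sigma).

    overlap[i][j] is Tr P_a Q_b; the returned dicts are keyed by (a, b).
    """
    p, q = {}, {}
    for i, a in enumerate(spec_rho):
        for j, b in enumerate(spec_sigma):
            w = overlap[i][j]
            p[(a, b)] = a * w  # first distribution: a * Tr P_a Q_b
            q[(a, b)] = b * w  # second distribution: b * Tr P_a Q_b
    return p, q

def classical_f_divergence(p, q, f, omega):
    """Classical f-divergence: sum of q f(p/q), plus p * omega where q = 0 < p."""
    total = 0.0
    for key, pk in p.items():
        qk = q[key]
        if qk > 0:
            total += qk * f(pk / qk)
        elif pk > 0:
            total += pk * omega
    return total

# Hypothetical qubit data: rho = diag(3/4, 1/4) in the computational basis,
# sigma with eigenvalues 2/3 and 1/3 in the Hadamard basis.
spec_rho = [0.75, 0.25]
spec_sigma = [2 / 3, 1 / 3]
overlap = [[0.5, 0.5], [0.5, 0.5]]
```

With $f(x) = x^2$ the classical value computed from these two distributions coincides with $\mathrm{Tr}\, \rho^2\sigma^{-1} = 90/64$ for this pair, as the lemma predicts; both dictionaries sum to one because $\rho$ and $\sigma$ have unit trace.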

2.11 Example. For two density operators $\rho, \sigma$, their quantum fidelity is given by
$F(\rho, \sigma) := \mathrm{Tr} \sqrt{\rho^{1/2}\sigma\rho^{1/2}}$ [55]. For classical
probability distributions, the fidelity coincides with $S_{f_{1/2}}$, where
$f_{1/2}(x) = x^{1/2}$. If the fidelity could be represented as an f-divergence for quantum
states then the representing function should be $f_{1/2}$, due to Corollary 2.10. However, the
corresponding quantum f-divergence is
$S_{f_{1/2}}(\rho\|\sigma) = \mathrm{Tr}\, \rho^{1/2}\sigma^{1/2}$, which is not equal to
$F(\rho, \sigma)$ in general. This shows that the fidelity of quantum states cannot be
represented as an f-divergence.

In Sections 6 and 7 we give similar non-representability results for measures related to
state discrimination on the state spaces of individual algebras.

Our last proposition in this section says that the f-divergences are continuous in their
second variable. Note that continuity in the first variable is not true in general. As a
counterexample, consider $A := B := P$ for some non-trivial projection $P$ on a Hilbert space,
and let $f(x) := x\log x$. Then $S_f(A + \varepsilon I\|B) = +\infty$, $\varepsilon > 0$, while
$S_f(A\|B) = 0$.
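2.12 Proposition. Let $A, B, B_k \in \mathcal{A}$ with $A, B, B_k \ge 0$ for all
$k \in \mathbb{N}$, and assume that $\lim_{k \to \infty} B_k = B$. Then

$$\lim_{k \to \infty} S_f(A\|B_k) = S_f(A\|B).$$

Proof. By the definition (2.2), we can choose a sequence $\varepsilon_k > 0$,
$k \in \mathbb{N}$, such that $\lim_{k \to \infty} \varepsilon_k = 0$, and for all
$k \in \mathbb{N}$,

$$S_f(A\|B_k + \varepsilon_kI) - \frac{1}{k} < S_f(A\|B_k) < S_f(A\|B_k + \varepsilon_kI) + \frac{1}{k}$$

if $S_f(A\|B_k)$ is finite, and

$$S_f(A\|B_k + \varepsilon_kI) > k \qquad \text{or} \qquad S_f(A\|B_k + \varepsilon_kI) < -k$$

if $S_f(A\|B_k) = +\infty$ or $S_f(A\|B_k) = -\infty$, respectively. Let
$\tilde B_k := B_k + \varepsilon_kI$, which is strictly positive for all $k \in \mathbb{N}$.
Obviously, $\lim_{k \to \infty} \tilde B_k = B$, and the assertion will follow if we can show
that

$$\lim_{k \to \infty} S_f(A\|\tilde B_k) = S_f(A\|B).$$

Let $A = \sum_{a \in \mathrm{spec}(A)} aP_a$, $B = \sum_{b \in \mathrm{spec}(B)} bQ_b$ and
$\tilde B_k = \sum_{c \in \mathrm{spec}(\tilde B_k)} cQ_c^{(k)}$ be the spectral decompositions
of the respective operators. Then

$$S_f(A\|\tilde B_k) = \sum_{a \in \mathrm{spec}(A)} \sum_{c \in \mathrm{spec}(\tilde B_k)} f(a/c)c\, \mathrm{Tr}\, P_aQ_c^{(k)}.$$

From the continuity of the eigenvalues and the spectral projections when $\tilde B_k \to B$, we
see that, for every $\delta > 0$ with
$\delta < \frac{1}{2} \min\{|b - b'| : b, b' \in \mathrm{spec}(B),\ b \ne b'\}$, if $k$ is
sufficiently large, then we have

$$\mathrm{spec}(\tilde B_k) \subset \bigcup_{b \in \mathrm{spec}(B)} (b - \delta, b + \delta) \qquad \text{(disjoint union)}$$

and moreover,

$$\tilde Q_b^{(k)} := \sum_{c \in \mathrm{spec}(\tilde B_k),\ c \in (b - \delta, b + \delta)} Q_c^{(k)} \longrightarrow Q_b \quad \text{as } k \to +\infty, \text{ for all } b \in \mathrm{spec}(B).$$

Assume that $S_f(A\|B) \in (-\infty, +\infty)$. Then by (2.4), it follows that
$\omega(f)a \in (-\infty, +\infty)$ when $a \in \mathrm{spec}(A)$ and $P_aQ_0 \ne 0$. Due to
(2.3), for every $\varepsilon > 0$ there exists a $\delta > 0$ as above such that, for
$a \in \mathrm{spec}(A)$, $b \in \mathrm{spec}(B)$ and $c \in \mathrm{spec}(\tilde B_k)$,

$$\begin{aligned}
|f(a/c)c - f(a/b)b| &< \varepsilon \quad \text{if } b > 0 \text{ and } c \in (b - \delta, b + \delta), \\
|f(a/c)c - \omega(f)a| &< \varepsilon \quad \text{if } c \in (0, \delta) \text{ and } P_aQ_0 \ne 0.
\end{aligned}$$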
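Hence, if $k$ is sufficiently large, then we have

$$\begin{aligned}
&|S_f(A\|\tilde B_k) - S_f(A\|B)| \\
&\quad = \Big| \sum_{a \in \mathrm{spec}(A)} \sum_{c \in \mathrm{spec}(\tilde B_k)} f(a/c)c\, \mathrm{Tr}\, P_aQ_c^{(k)} - \sum_{a \in \mathrm{spec}(A)} \sum_{b \in \mathrm{spec}(B) \setminus \{0\}} f(a/b)b\, \mathrm{Tr}\, P_aQ_b - \sum_{a \in \mathrm{spec}(A)} \omega(f)a\, \mathrm{Tr}\, P_aQ_0 \Big| \\
&\quad \le \sum_{a \in \mathrm{spec}(A)} \sum_{b \in \mathrm{spec}(B) \setminus \{0\}} \Big| \sum_{c \in \mathrm{spec}(\tilde B_k),\ c \in (b - \delta, b + \delta)} f(a/c)c\, \mathrm{Tr}\, P_aQ_c^{(k)} - f(a/b)b\, \mathrm{Tr}\, P_aQ_b \Big| \\
&\qquad + \sum_{a \in \mathrm{spec}(A)} \Big| \sum_{c \in \mathrm{spec}(\tilde B_k),\ c \in (0, \delta)} f(a/c)c\, \mathrm{Tr}\, P_aQ_c^{(k)} - \omega(f)a\, \mathrm{Tr}\, P_aQ_0 \Big| \\
&\quad \le \sum_{a \in \mathrm{spec}(A)} \sum_{b \in \mathrm{spec}(B) \setminus \{0\}} \Big\{ \sum_{c \in \mathrm{spec}(\tilde B_k),\ c \in (b - \delta, b + \delta)} |f(a/c)c - f(a/b)b|\, \mathrm{Tr}\, P_aQ_c^{(k)} + \big| f(a/b)b\, \mathrm{Tr}\, P_a(\tilde Q_b^{(k)} - Q_b) \big| \Big\} \\
&\qquad + \sum_{a \in \mathrm{spec}(A)} \Big\{ \sum_{c \in \mathrm{spec}(\tilde B_k),\ c \in (0, \delta)} |f(a/c)c - \omega(f)a|\, \mathrm{Tr}\, P_aQ_c^{(k)} + \big| \omega(f)a\, \mathrm{Tr}\, P_a(\tilde Q_0^{(k)} - Q_0) \big| \Big\} \\
&\quad \le \varepsilon\, \mathrm{Tr}\, I + \sum_{a \in \mathrm{spec}(A)} \sum_{b \in \mathrm{spec}(B) \setminus \{0\}} |f(a/b)b|\, \big\| \tilde Q_b^{(k)} - Q_b \big\|_1 + \sum_{a \in \mathrm{spec}(A)} |\omega(f)a|\, \big\| \tilde Q_0^{(k)} - Q_0 \big\|_1.
\end{aligned}$$

This implies that

$$\limsup_{k \to \infty} |S_f(A\|\tilde B_k) - S_f(A\|B)| \le \varepsilon\, \mathrm{Tr}\, I$$

for every $\varepsilon > 0$, and so

$$\lim_{k \to \infty} S_f(A\|\tilde B_k) = S_f(A\|B).$$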

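Next, assume that $S_f(A\|B) = +\infty$. Then $\omega(f) = +\infty$ and there is an
$a_0 \in \mathrm{spec}(A) \setminus \{0\}$ such that $P_{a_0}Q_0 \ne 0$. For every
$\varepsilon > 0$ there exists a $\delta > 0$ as above such that, for
$a \in \mathrm{spec}(A)$, $b \in \mathrm{spec}(B)$ and $c \in \mathrm{spec}(\tilde B_k)$,

$$\begin{aligned}
|f(a/c)c - f(a/b)b| &< \varepsilon \quad \text{if } b > 0 \text{ and } c \in (b - \delta, b + \delta), \\
f(a/c)c &> 1/\varepsilon \quad \text{if } a > 0 \text{ and } c \in (0, \delta).
\end{aligned}$$

Hence, if $k$ is sufficiently large, then we have

$$\begin{aligned}
S_f(A\|\tilde B_k) &\ge \sum_{a \in \mathrm{spec}(A)} \sum_{b \in \mathrm{spec}(B) \setminus \{0\}} \sum_{c \in \mathrm{spec}(\tilde B_k),\ c \in (b - \delta, b + \delta)} (f(a/b)b - \varepsilon)\, \mathrm{Tr}\, P_aQ_c^{(k)} \\
&\quad + \sum_{c \in \mathrm{spec}(\tilde B_k),\ c \in (0, \delta)} (-|f(0)|\delta)\, \mathrm{Tr}\, P_0Q_c^{(k)} + \sum_{a \in \mathrm{spec}(A),\ a > 0} \sum_{c \in \mathrm{spec}(\tilde B_k),\ c \in (0, \delta)} (1/\varepsilon)\, \mathrm{Tr}\, P_aQ_c^{(k)} \\
&\ge -(\mathrm{Tr}\, I) \sum_{a \in \mathrm{spec}(A)} \sum_{b \in \mathrm{spec}(B) \setminus \{0\}} |f(a/b)b - \varepsilon| - |f(0)|\delta\, \mathrm{Tr}\, P_0\tilde Q_0^{(k)} + (1/\varepsilon)\, \mathrm{Tr}\, P_{a_0}\tilde Q_0^{(k)},
\end{aligned}$$

which implies that

$$\liminf_{k \to \infty} S_f(A\|\tilde B_k) \ge -(\mathrm{Tr}\, I) \sum_{a \in \mathrm{spec}(A)} \sum_{b \in \mathrm{spec}(B) \setminus \{0\}} |f(a/b)b - \varepsilon| - |f(0)|\delta\, \mathrm{Tr}\, P_0Q_0 + (1/\varepsilon)\, \mathrm{Tr}\, P_{a_0}Q_0.$$

Since $\mathrm{Tr}\, P_{a_0}Q_0 > 0$ and both $\varepsilon > 0$ and $\delta > 0$ can be chosen
to be arbitrarily small, we have

$$\lim_{k \to \infty} S_f(A\|\tilde B_k) = +\infty = S_f(A\|B).$$

The case where $S_f(A\|B) = -\infty$ is similar. □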

3 Preliminaries on positive maps 

Let $\mathcal{A}_i \subset \mathcal{B}(\mathcal{H}_i)$ be finite-dimensional C*-algebras with
unit $I_i$ for $i = 1, 2$. For a subset $\mathcal{B} \subset \mathcal{A}_i$, we will denote the
set of positive elements in $\mathcal{B}$ by $\mathcal{B}_+$; in particular,
$\mathcal{A}_{i,+}$ denotes the set of positive elements in $\mathcal{A}_i$. For a linear map
$\Phi : \mathcal{A}_1 \to \mathcal{A}_2$, we denote its adjoint with respect to the
Hilbert-Schmidt inner products by $\Phi^*$. Note that $\Phi$ and $\Phi^*$ uniquely determine
each other and, moreover, $\Phi$ is positive/n-positive/completely positive if and only if
$\Phi^*$ is positive/n-positive/completely positive, and $\Phi$ is trace-preserving/trace
non-increasing if and only if $\Phi^*$ is unital/sub-unital.

For given $B \in \mathcal{A}_{1,+}$ and $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$, we define
$\Phi_B : \mathcal{A}_1 \to \mathcal{A}_2$ and $\Phi_B^* : \mathcal{A}_2 \to \mathcal{A}_1$ as

$$\begin{aligned}
\Phi_B(X) &:= \Phi(B)^{-1/2}\, \Phi(B^{1/2}XB^{1/2})\, \Phi(B)^{-1/2}, && X \in \mathcal{A}_1, && (3.1) \\
\Phi_B^*(Y) &:= B^{1/2}\, \Phi^*\big(\Phi(B)^{-1/2}\, Y\, \Phi(B)^{-1/2}\big)\, B^{1/2}, && Y \in \mathcal{A}_2. && (3.2)
\end{aligned}$$

With these notations, we have $(\Phi_B)^* = \Phi_B^*$ and $(\Phi_B^*)^* = \Phi_B$.

For a normal operator $X \in \mathcal{A}_1$, let $P_{\{1\}}(X)$ denote the spectral projection
of $X$ onto its fixed-point set. Note that if $B \in \mathcal{A}_{1,+}$ then $B^0$ is a
projection in $\mathcal{A}_1$ and hence $B^0\mathcal{A}_1B^0$ is a C*-algebra with unit $B^0$.

3.1 Lemma. If $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ is a positive map and $A, B$ are
positive elements in $\mathcal{A}_1$ such that $A^0 = B^0$ then $\Phi(A)^0 = \Phi(B)^0$. In
particular, $\Phi(B)^0 = \Phi(B^0)^0$ for any positive $B \in \mathcal{A}_1$.

Proof. The assumption $A^0 = B^0$ is equivalent to the existence of strictly positive numbers
$\alpha, \beta$ such that $\alpha A \le B \le \beta A$, which yields
$\alpha\Phi(A) \le \Phi(B) \le \beta\Phi(A)$ and hence $\Phi(A)^0 = \Phi(B)^0$. □

3.2 Lemma. Let $B \in \mathcal{A}_{1,+}$ and let $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ be a
positive map such that $\Phi^*(\Phi(B)^0) \le I_1$ (in particular, this is the case if $\Phi$
is trace non-increasing). Then

$$\mathrm{Tr}\, \Phi(B) \le \mathrm{Tr}\, B,$$

and the following are equivalent:

(i) $\mathrm{Tr}\, \Phi(B) = \mathrm{Tr}\, B$.

(ii) For any function $f$ on $\mathrm{spec}(B)$ such that $f(0) = 0$, we have

$$f(B)\, \Phi^*(\Phi(B)^0) = \Phi^*(\Phi(B)^0)\, f(B) = f(B).$$

(iii) $B^0 \le P_{\{1\}}(\Phi^*(\Phi(B)^0))$.

(iv) $\Phi$ is trace-preserving on $B^0\mathcal{A}_1B^0$. (In particular, if
$A \in \mathcal{A}_{1,+}$ is such that $A^0 \le B^0$ then
$\mathrm{Tr}\, \Phi(A) = \mathrm{Tr}\, A$.)

(v) For the map $\Phi_B^*$ given in (3.2), we have

$$\Phi_B^*(\Phi(B)) = B.$$



Proof. By assumption, $0 \le I_1 - \Phi^*(\Phi(B)^0)$ and hence,

$$0 \le \mathrm{Tr}\, B\big(I_1 - \Phi^*(\Phi(B)^0)\big) = \mathrm{Tr}\, B - \mathrm{Tr}\, \Phi(B)\Phi(B)^0 = \mathrm{Tr}\, B - \mathrm{Tr}\, \Phi(B).$$

If $\mathrm{Tr}\, \Phi(B) = \mathrm{Tr}\, B$ then $(I_1 - \Phi^*(\Phi(B)^0))B = 0$, i.e.,
$B = \Phi^*(\Phi(B)^0)B$, so we get $B^n = \Phi^*(\Phi(B)^0)B^n$, $n \in \mathbb{N}$, which
yields (ii). Hence, the implication (i) ⇒ (ii) holds. If (ii) holds then we have
$B^0 = \Phi^*(\Phi(B)^0)B^0$ and hence, for any $x \in \mathcal{H}_1$ such that $B^0x = x$, we
have $x = B^0x = \Phi^*(\Phi(B)^0)B^0x = \Phi^*(\Phi(B)^0)x$, or equivalently,
$x \in \operatorname{ran} P_{\{1\}}(\Phi^*(\Phi(B)^0))$. This yields (iii), and the converse
direction (iii) ⇒ (ii) is obvious. Assume now that (ii) holds. If
$X \in B^0\mathcal{A}_1B^0$ then $XB^0 = B^0X = X$, and

$$\mathrm{Tr}\, \Phi(X) = \mathrm{Tr}\, \Phi(X)\Phi(B)^0 = \mathrm{Tr}\, X\Phi^*(\Phi(B)^0) = \mathrm{Tr}\, XB^0\Phi^*(\Phi(B)^0) = \mathrm{Tr}\, XB^0 = \mathrm{Tr}\, X,$$

showing (iv). The implication (iv) ⇒ (i) is obvious.

Assume that (ii) holds. Then
$\Phi_B^*(\Phi(B)) = B^{1/2}\Phi^*(\Phi(B)^0)B^{1/2} = B$, showing (v). On the other hand, if
(v) holds then $B^{1/2}\Phi^*(\Phi(B)^0)B^{1/2} = B$, and hence
$0 = B^{1/2}(I_1 - \Phi^*(\Phi(B)^0))B^{1/2}$. Since $I_1 - \Phi^*(\Phi(B)^0) \ge 0$, we obtain
$B^{1/2}(I_1 - \Phi^*(\Phi(B)^0))^{1/2} = 0$, which in turn yields
$B = B\Phi^*(\Phi(B)^0)$. From this (ii) follows as above. □



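3.3 Corollary. Let $A, B \in \mathcal{A}_{1,+}$, and let
$\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ be a trace non-increasing positive map. Then $\Phi$ is
trace-preserving on $(A + B)^0\mathcal{A}_1(A + B)^0$ if and only if

$$\mathrm{Tr}\, \Phi(A) = \mathrm{Tr}\, A \qquad \text{and} \qquad \mathrm{Tr}\, \Phi(B) = \mathrm{Tr}\, B.$$

Proof. Obvious from Lemma 3.2. □

3.4 Corollary. Let $A, B \in \mathcal{A}_{1,+}$ and let
$\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ be a trace non-increasing positive map such that
$\mathrm{Tr}\, \Phi(A) = \mathrm{Tr}\, A$. Then

$$\mathrm{Tr}\, \Phi(B)\Phi(A)^0 \ge \mathrm{Tr}\, BA^0 \qquad \text{and} \qquad \mathrm{Tr}\, \Phi(B)(I_2 - \Phi(A)^0) \le \mathrm{Tr}\, B(I_1 - A^0).$$

Note that the first inequality means the monotonicity of the Rényi 0-relative entropy,
$S_0(A\|B) \ge S_0(\Phi(A)\|\Phi(B))$, under the given conditions.

Proof. Due to Lemma 3.2, the assumptions yield that
$A^0 \le P_{\{1\}}(\Phi^*(\Phi(A)^0)) \le \Phi^*(\Phi(A)^0)$, and hence
$0 \le \mathrm{Tr}\, B(\Phi^*(\Phi(A)^0) - A^0) = \mathrm{Tr}\, \Phi(B)\Phi(A)^0 - \mathrm{Tr}\, BA^0$.
The second inequality follows by taking into account that
$\mathrm{Tr}\, \Phi(B) \le \mathrm{Tr}\, B$. □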



The following lemma yields the monotonicity of the Rényi 2-relative entropies, and is needed
to prove the monotonicity of general f-divergences. The statement and its proof can be
obtained by following the proofs of Theorem 1.3.3, Theorem 2.3.2 (Kadison's inequality) and
Proposition 2.7.3 in [5] using the weaker conditions given here. For readers' convenience, we
include a self-contained proof here.
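3.5 Lemma. Let $A, B \in \mathcal{A}_{1,+}$ and $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ be a
positive map. Then

$$\Phi(B^0AB^0)\, \Phi(B)^{-1}\, \Phi(B^0AB^0) \le \Phi(B^0AB^{-1}AB^0). \tag{3.3}$$

In particular, if $A^0 \le B^0$ then

$$\Phi(A)\Phi(B)^{-1}\Phi(A) \le \Phi(AB^{-1}A). \tag{3.4}$$

If, moreover, $\Phi$ is also trace non-increasing then

$$S_{f_2}(\Phi(A)\|\Phi(B)) = \mathrm{Tr}\, \Phi(A)^2\Phi(B)^{-1} \le \mathrm{Tr}\, A^2B^{-1} = S_{f_2}(A\|B). \tag{3.5}$$

Proof. Define $\Psi : \mathcal{A}_1 \to \mathcal{A}_2$ as $\Psi(X) := \Phi(B^{1/2}XB^{1/2})$,
$X \in \mathcal{A}_1$. Let $X := B^{-1/2}AB^{-1/2}$ and let $X = \sum_{x \in \sigma(X)} xP_x$ be
its spectral decomposition. Then

$$\tilde X := \begin{bmatrix} \Psi(X^2) & \Psi(X) \\ \Psi(X) & \Psi(I_1) \end{bmatrix} = \sum_{x \in \sigma(X)} \begin{bmatrix} x^2 & x \\ x & 1 \end{bmatrix} \otimes \Psi(P_x) \ge 0,$$

and hence we have

$$0 \le Y\tilde XY^* = \begin{bmatrix} \Psi(X^2) - \Psi(X)\Psi(I_1)^{-1}\Psi(X) & \Psi(X)(I_2 - \Psi(I_1)^0) \\ (I_2 - \Psi(I_1)^0)\Psi(X) & \Psi(I_1) \end{bmatrix},$$

where

$$Y := \begin{bmatrix} I_2 & -\Psi(X)\Psi(I_1)^{-1} \\ 0 & I_2 \end{bmatrix}.$$

Hence $\Psi(X^2) \ge \Psi(X)\Psi(I_1)^{-1}\Psi(X)$, which is exactly (3.3). The inequalities in
(3.4) and (3.5) follow immediately. □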
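To make inequality (3.5) concrete, here is a small Python check of its commuting special case, in which $A$ and $B$ are diagonal with entries $p$ and $q$ and the positive trace-preserving map acts on the diagonals through a column-stochastic matrix $T$. This is only an illustrative sketch under that classical assumption; the names `apply_map` and `s_f2` and the numerical data are ours, not the paper's.

```python
def apply_map(T, v):
    """Apply the matrix T (list of rows) to the vector v."""
    return [sum(T[i][j] * v[j] for j in range(len(v))) for i in range(len(T))]

def s_f2(p, q):
    """Classical S_{f_2}: sum of p_i^2 / q_i over the support of q.

    Assumes supp p <= supp q, so entries with q_i = 0 also have p_i = 0.
    """
    return sum(pi * pi / qi for pi, qi in zip(p, q) if qi > 0)

# Hypothetical data: each column of T sums to 1, so T preserves the total mass.
T = [[0.5, 0.2, 0.0],
     [0.5, 0.8, 1.0]]
p = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]

before = s_f2(p, q)
after = s_f2(apply_map(T, p), apply_map(T, q))
```

For these numbers `after` does not exceed `before`, in line with (3.5); the general operator statement of course requires the proof given above and is not established by such a check.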



We say that a map $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ is a Schwarz map if

$$\|\Phi\|_S := \inf\{c \in [0, +\infty) : \Phi(X)^*\Phi(X) \le c\,\Phi(X^*X),\ X \in \mathcal{A}_1\} < +\infty.$$

Obviously, if $\Phi$ is a Schwarz map then $\Phi$ is positive, and we have
$\|\Phi\| = \|\Phi(I_1)\| \le \|\Phi\|_S$. (Note that $\|\Phi\| = \|\Phi(I_1)\|$ is true for any
positive map $\Phi$ [5, Corollary 2.3.8].) We say that $\Phi$ is a Schwarz contraction if it is
a Schwarz map with $\|\Phi\|_S \le 1$. A Schwarz contraction $\Phi$ is also a contraction, due
to $\|\Phi\| \le \|\Phi\|_S$. Note that a positive map $\Phi$ is a contraction if and only if it
is subunital, which is equivalent to $\Phi^*$ being trace non-increasing. We say that a map
$\Phi$ between two finite-dimensional C*-algebras is a substochastic map if its Hilbert-Schmidt
adjoint $\Phi^*$ is a Schwarz contraction, and $\Phi$ is stochastic if it is a trace-preserving
substochastic map. Note that in the commutative finite-dimensional case substochastic/stochastic
maps are exactly the ones that can be represented by substochastic/stochastic matrices.

It is known that if $\Phi$ is 2-positive then it is a Schwarz map with
$\|\Phi\|_S = \|\Phi\|$. In general, however, we might have
$\|\Phi\| < \|\Phi\|_S < +\infty$, as the following example shows. In particular, not every
Schwarz map is 2-positive.

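3.6 Example. Let $\mathcal{H}$ be a finite-dimensional Hilbert space, and for every $\varepsilon \in \mathbb{R}$, let $\Phi_\varepsilon : \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{H})$ be the map

\[ \Phi_\varepsilon(X) := (1-\varepsilon)X^T + \varepsilon(\operatorname{Tr} X)I/d, \qquad X \in \mathcal{B}(\mathcal{H}), \]

where $d := \dim\mathcal{H} > 1$ and $X^T$ denotes the transpose of $X$ in some fixed basis $\{e_1,\ldots,e_d\}$ of $\mathcal{H}$. It was shown in [51] that $\Phi_\varepsilon$ is positive if and only if $0 \le \varepsilon \le 1 + 1/(d-1)$, for $k \ge 2$ it is $k$-positive if and only if $1 - 1/(d+1) \le \varepsilon \le 1 + 1/(d-1)$, and it is a Schwarz contraction if and only if $1 - 1/\big(1/2 + \sqrt{d + 1/4}\big) \le \varepsilon \le 1 + 1/(d-1)$. This already shows that there are parameter values $\varepsilon$ for which $\Phi_\varepsilon$ is a Schwarz contraction but not 2-positive. Moreover, if $\varepsilon \in [0,1)$ then for every $c \in [0,+\infty)$ we have

\[ \begin{aligned} c\,\Phi_\varepsilon(X^*X) - \Phi_\varepsilon(X^*)\Phi_\varepsilon(X) &= c(1-\varepsilon)(X^*X)^T + c\varepsilon(\operatorname{Tr} X^*X)I/d - (1-\varepsilon)^2 (X^*)^T X^T \\ &\quad - \varepsilon(1-\varepsilon)(\operatorname{Tr} X)(X^*)^T/d - \varepsilon(1-\varepsilon)(\operatorname{Tr} X^*)X^T/d - \varepsilon^2 |\operatorname{Tr} X|^2 I/d^2 \\ &\ge (\operatorname{Tr} X^*X)\,(I/d)\left[c\varepsilon - d(1-\varepsilon)^2 - 2\varepsilon(1-\varepsilon)\sqrt{d} - \varepsilon^2\right], \end{aligned} \]

where we used that $|\operatorname{Tr} X|^2 \le (\operatorname{Tr} I)(\operatorname{Tr} X^*X)$ and $X^*X \le \|X\|^2 I \le (\operatorname{Tr} X^*X)I$. This shows that $\Phi_\varepsilon$ is a Schwarz map for every $\varepsilon \in (0,1)$ and $\|\Phi_\varepsilon\|_S \le (1/\varepsilon)\big(d(1-\varepsilon)^2 + 2\varepsilon(1-\varepsilon)\sqrt{d} + \varepsilon^2\big)$. Note that for $X := |e_1\rangle\langle e_2|$ we have

\[ 0 \le \big\langle e_1, \big(\|\Phi_\varepsilon\|_S\,\Phi_\varepsilon(X^*X) - \Phi_\varepsilon(X^*)\Phi_\varepsilon(X)\big) e_1 \big\rangle = \|\Phi_\varepsilon\|_S\,\varepsilon/d - (1-\varepsilon)^2, \]

which yields that $\|\Phi_\varepsilon\|_S \ge d(1-\varepsilon)^2/\varepsilon$. In particular, $\lim_{\varepsilon\searrow 0}\|\Phi_\varepsilon\|_S = +\infty$. Since $\Phi_\varepsilon$ is a positive unital map for every $\varepsilon \in [0, 1 + 1/(d-1)]$, we have $\|\Phi_\varepsilon\| = 1$ for every $\varepsilon \in [0, 1 + 1/(d-1)]$, while $\|\Phi_\varepsilon\|_S > 1$ and hence $\|\Phi_\varepsilon\| < \|\Phi_\varepsilon\|_S$ whenever $(1-\varepsilon)^2/\varepsilon > 1/d$.

Similarly, it was shown in [51] that the map

\[ \Psi_\varepsilon(X) := (1-\varepsilon)X + \varepsilon(\operatorname{Tr} X)I/d, \qquad X \in \mathcal{B}(\mathcal{H}), \]

is completely positive if and only if $0 \le \varepsilon \le 1 + 1/(d^2-1)$, for $1 \le k \le d-1$ it is $k$-positive if and only if $0 \le \varepsilon \le 1 + 1/(dk-1)$, and it is a Schwarz contraction if and only if $0 \le \varepsilon \le 1 + 1/d$. A similar computation as above shows that $\Psi_\varepsilon$ is a Schwarz map if and only if $0 \le \varepsilon < 1 + 1/(d-1)$, and $\lim_{\varepsilon\nearrow 1+1/(d-1)}\|\Psi_\varepsilon\|_S = +\infty$.

Finally, the map

\[ \Lambda_\varepsilon(X) := (1-\varepsilon)X^T + \varepsilon X, \qquad X \in \mathcal{B}(\mathcal{H}), \]

is positive if and only if $0 \le \varepsilon \le 1$, and it is $k$-positive for $k \ge 2$ if and only if $\varepsilon = 1$, if and only if it is a Schwarz contraction [51]. Moreover, for $X := |e_1\rangle\langle e_2|$ and every $c \in \mathbb{R}$ we have $\langle e_1, (c\Lambda_\varepsilon(X^*X) - \Lambda_\varepsilon(X^*)\Lambda_\varepsilon(X)) e_1\rangle = -(1-\varepsilon)^2$, and hence $\Lambda_\varepsilon$ is a Schwarz map if and only if $\varepsilon = 1$.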
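The matrix-element computation for $X = |e_1\rangle\langle e_2|$ in the first part of the example can be checked mechanically. The following is a minimal sketch, assuming real matrices stored as nested lists (so that the adjoint is the transpose) and exact rational arithmetic; the helper names `phi` and `schwarz_gap_at_e1` are illustrative and do not come from the paper.

```python
from fractions import Fraction

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def phi(x, eps):
    """Phi_eps(X) = (1 - eps) X^T + eps (Tr X) I / d for a d x d matrix X."""
    d = len(x)
    xt, t = transpose(x), trace(x)
    return [[(1 - eps) * xt[i][j] + (eps * t / d if i == j else 0)
             for j in range(d)] for i in range(d)]

def schwarz_gap_at_e1(d, eps, c):
    """<e_1, (c Phi(X*X) - Phi(X*) Phi(X)) e_1> for the matrix unit X = |e_1><e_2|."""
    x = [[Fraction(int((i, j) == (0, 1))) for j in range(d)] for i in range(d)]
    xs = transpose(x)  # real entries, so the adjoint is the transpose
    lhs = phi(matmul(xs, x), eps)
    rhs = matmul(phi(xs, eps), phi(x, eps))
    return c * lhs[0][0] - rhs[0][0]

d, eps = 3, Fraction(1, 2)
for c in (Fraction(1), Fraction(3, 2), Fraction(2)):
    gap = schwarz_gap_at_e1(d, eps, c)
    # closed form used in the example: c * eps / d - (1 - eps)^2
    assert gap == c * eps / d - (1 - eps) ** 2
    print(c, gap)
```

For $d = 3$ and $\varepsilon = 1/2$ the matrix element changes sign exactly at $c = d(1-\varepsilon)^2/\varepsilon = 3/2$, which is the lower bound on $\|\Phi_\varepsilon\|_S$ obtained in the example.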

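3.7 Lemma. Let $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ be a substochastic map, and assume that there exists a $B \in \mathcal{A}_{1,+} \setminus \{0\}$ such that $\operatorname{Tr}\Phi(B) = \operatorname{Tr} B$. Then $\|\Phi^*\|_S = \|\Phi^*\| = 1$.

Proof. Let $\tilde{\mathcal{A}}_1 := B^0\mathcal{A}_1 B^0$, $\tilde{\mathcal{A}}_2 := \Phi(B)^0\mathcal{A}_2\Phi(B)^0$, and define $\tilde\Phi : \tilde{\mathcal{A}}_1 \to \tilde{\mathcal{A}}_2$ as $\tilde\Phi(X) := \Phi(B^0XB^0) = \Phi(X)$, $X \in \tilde{\mathcal{A}}_1$. Then $\tilde\Phi^*(Y) = B^0\Phi^*(Y)B^0$, $Y \in \tilde{\mathcal{A}}_2$, and Lemma 3.2 yields that $\tilde\Phi^*(\Phi(B)^0) = B^0$, i.e., $\tilde\Phi^*$ is unital. Hence, $1 = \|\tilde\Phi^*\| \le \|\Phi^*\| \le \|\Phi^*\|_S \le 1$, from which the assertion follows. □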

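3.8 Lemma. The set of Schwarz maps is closed under composition, taking the adjoint, and positive linear combinations. Moreover, for $a \ge 0$ and $\Phi, \Phi_1, \Phi_2 : \mathcal{A}_1 \to \mathcal{A}_2$,

\[ \|a\Phi\|_S = a\|\Phi\|_S, \qquad \|\Phi_1 + \Phi_2\|_S \le \|\Phi_1\|_S + \|\Phi_2\|_S. \tag{3.6} \]

Proof. The assertion about the composition is obvious. To prove closedness under the adjoint, assume that $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ is a Schwarz map. Our goal is to prove that $\Phi^*$ is a Schwarz map, too. Let $\iota_k$ be the trivial embedding of $\mathcal{A}_k$ into $\mathcal{B}(\mathcal{H}_k)$ for $k = 1,2$. The adjoint $\pi_k := \iota_k^*$ of $\iota_k$ is the trace-preserving conditional expectation (or equivalently, the Hilbert-Schmidt orthogonal projection) from $\mathcal{B}(\mathcal{H}_k)$ onto $\mathcal{A}_k$. Since $\iota_k$ is completely positive, so is $\pi_k$, and since $\pi_k$ is unital, it is also a Schwarz contraction. Let $\tilde\Phi := \iota_2 \circ \Phi \circ \pi_1$, the adjoint of which is $\tilde\Phi^* = \iota_1 \circ \Phi^* \circ \pi_2$. Note that $\tilde\Phi$ is a Schwarz map, too, with $\|\tilde\Phi\|_S = \|\Phi\|_S$, since for any $X \in \mathcal{B}(\mathcal{H}_1)$,

\[ \tilde\Phi(X^*)\tilde\Phi(X) = \iota_2\big(\Phi(\pi_1(X^*))\Phi(\pi_1(X))\big) \le \|\Phi\|_S\,\iota_2\Phi\big(\pi_1(X^*)\pi_1(X)\big) \le \|\Phi\|_S\,\tilde\Phi(X^*X). \]

Hence, for any vector $v \in \mathcal{H}_1$ and any orthonormal basis $\{e_i\}_{i=1}^{d_1}$ in $\mathcal{H}_1$, we have

\[ \|\Phi\|_S\,\tilde\Phi(|v\rangle\langle v|) \ge \tilde\Phi(|v\rangle\langle e_i|)\,\tilde\Phi(|e_i\rangle\langle v|), \qquad i = 1,\ldots,d_1, \]

where $d_1 := \dim\mathcal{H}_1$. Let $Y \in \mathcal{B}(\mathcal{H}_2)$ be arbitrary. Multiplying the above inequality with $Y$ from the left and $Y^*$ from the right, and taking the trace, we obtain

\[ \|\Phi\|_S\,\langle v, \tilde\Phi^*(Y^*Y)v\rangle = \|\Phi\|_S \operatorname{Tr} Y\tilde\Phi(|v\rangle\langle v|)Y^* \ge \operatorname{Tr} Y\tilde\Phi(|v\rangle\langle e_i|)\,\tilde\Phi(|e_i\rangle\langle v|)Y^*. \]

Note that $\operatorname{Tr} : \mathcal{A}_2 \to \mathbb{C}$ is completely positive, and hence it is a Schwarz map with $\|\operatorname{Tr}\|_S = \|\operatorname{Tr}(I_2)\| = d_2 := \dim\mathcal{H}_2$. Hence, the above inequality can be continued as

\[ d_2\|\Phi\|_S\,\langle v, \tilde\Phi^*(Y^*Y)v\rangle \ge \operatorname{Tr} Y\tilde\Phi(|v\rangle\langle e_i|)\,\operatorname{Tr}\tilde\Phi(|e_i\rangle\langle v|)Y^* = \langle v, \tilde\Phi^*(Y^*)e_i\rangle\langle e_i, \tilde\Phi^*(Y)v\rangle, \]

and summing over $i$ yields

\[ d_1 d_2\|\Phi\|_S\,\langle v, \tilde\Phi^*(Y^*Y)v\rangle \ge \langle v, \tilde\Phi^*(Y^*)\tilde\Phi^*(Y)v\rangle. \]

Since the above inequality is true for any $v \in \mathcal{H}_1$, and $\tilde\Phi^*(Y) = \Phi^*(Y)$ for any $Y \in \mathcal{A}_2$, the assertion follows.

The assertion on positive linear combinations follows from (3.6), and the first identity in (3.6) is obvious. To see the second relation in (3.6), assume first that $\Phi_1$ and $\Phi_2$ are Schwarz contractions. Then, for any $\varepsilon \in [0,1]$ and any $X \in \mathcal{A}_1$ we have

\[ \begin{aligned} &((1-\varepsilon)\Phi_1 + \varepsilon\Phi_2)(X^*X) - ((1-\varepsilon)\Phi_1 + \varepsilon\Phi_2)(X^*)\,((1-\varepsilon)\Phi_1 + \varepsilon\Phi_2)(X) \\ &\quad = (1-\varepsilon)\left[\Phi_1(X^*X) - \Phi_1(X^*)\Phi_1(X)\right] + \varepsilon\left[\Phi_2(X^*X) - \Phi_2(X^*)\Phi_2(X)\right] \\ &\qquad + \varepsilon(1-\varepsilon)\left[(\Phi_1(X) - \Phi_2(X))^*(\Phi_1(X) - \Phi_2(X))\right] \ge 0, \end{aligned} \]

and hence $(1-\varepsilon)\Phi_1 + \varepsilon\Phi_2$ is a Schwarz contraction for any $\varepsilon \in [0,1]$. Finally, let $\Phi_1, \Phi_2 : \mathcal{A}_1 \to \mathcal{A}_2$ be non-zero Schwarz maps. Then $\hat\Phi_k := \Phi_k/\|\Phi_k\|_S$ is a Schwarz contraction for $k = 1,2$, and choosing $\varepsilon := \|\Phi_2\|_S/(\|\Phi_1\|_S + \|\Phi_2\|_S)$, we get

\[ \|\Phi_1 + \Phi_2\|_S = (\|\Phi_1\|_S + \|\Phi_2\|_S)\,\|(1-\varepsilon)\hat\Phi_1 + \varepsilon\hat\Phi_2\|_S \le \|\Phi_1\|_S + \|\Phi_2\|_S. \qquad □ \]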
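The convex-combination step in the proof rests on a purely algebraic identity: writing $a := \Phi_1(X)$, $b := \Phi_2(X)$ and $m := (1-\varepsilon)a + \varepsilon b$, one has $(1-\varepsilon)a^*a + \varepsilon b^*b - m^*m = \varepsilon(1-\varepsilon)(a-b)^*(a-b)$. The sketch below checks this identity exactly for one pair of non-commuting real $2\times 2$ matrices; the matrices and helper names are illustrative assumptions, and the check says nothing about the Schwarz inequality itself.

```python
from fractions import Fraction

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lin(s, a, t, b):
    """Entrywise linear combination s*a + t*b of two matrices."""
    return [[s * x + t * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def gram(a):
    """a^T a, which equals a* a for a real matrix a."""
    return matmul(transpose(a), a)

def both_sides(a, b, eps):
    """Return both sides of
    (1-eps) a*a + eps b*b - m*m = eps (1-eps) (a-b)*(a-b),  m = (1-eps) a + eps b."""
    m = lin(1 - eps, a, eps, b)
    lhs = lin(1, lin(1 - eps, gram(a), eps, gram(b)), -1, gram(m))
    diff = lin(1, a, -1, b)
    rhs = [[eps * (1 - eps) * v for v in row] for row in gram(diff)]
    return lhs, rhs

a = [[1, 2], [0, 1]]   # stands in for Phi_1(X)
b = [[0, 1], [1, 1]]   # stands in for Phi_2(X); a and b do not commute
lhs, rhs = both_sides(a, b, Fraction(1, 3))
assert lhs == rhs
```

Since the right-hand side is of the form $\varepsilon(1-\varepsilon)\,c^*c$, it is positive semidefinite for $\varepsilon \in [0,1]$, which is what the proof uses.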

Lemma 3.9 and Corollary 3.10 below are well-known when $\Phi$ and $\gamma$ are unital 2-positive maps. Their proofs are essentially the same for Schwarz contractions, and we provide them here for the readers' convenience.

3.9 Lemma. Let $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ be a Schwarz map, and let

\[ \mathcal{M}_\Phi := \{X \in \mathcal{A}_1 : \Phi(X)\Phi(X^*) = \|\Phi\|_S\,\Phi(XX^*)\}. \]

Then

\[ X \in \mathcal{M}_\Phi \quad\text{if and only if}\quad \Phi(X)\Phi(Z) = \|\Phi\|_S\,\Phi(XZ), \quad Z \in \mathcal{A}_1. \tag{3.7} \]

Moreover, the set $\mathcal{M}_\Phi$ is a vector space that is closed under multiplication.

Proof. We may assume that $\|\Phi\|_S > 0$, since otherwise $\Phi = 0$ and the assertions become trivial. Define $\gamma(X_1, X_2) := \|\Phi\|_S\,\Phi(X_1X_2^*) - \Phi(X_1)\Phi(X_2)^*$, $X_1, X_2 \in \mathcal{A}_1$. Let $X \in \mathcal{M}_\Phi$, $Z \in \mathcal{A}_1$ and $t \in \mathbb{R}$. Then

\[ \begin{aligned} 0 \le \gamma(tX + Z, tX + Z) &= t^2\gamma(X,X) + t[\gamma(X,Z) + \gamma(Z,X)] + \gamma(Z,Z) \\ &= t[\gamma(X,Z) + \gamma(Z,X)] + \gamma(Z,Z). \end{aligned} \]

Since this is true for any $t \in \mathbb{R}$, we get $\gamma(X,Z) + \gamma(Z,X) = 0$, and repeating the same argument with $iZ$ in place of $Z$, we get $\gamma(X,Z) - \gamma(Z,X) = 0$. Hence $\gamma(X,Z) = 0$ for every $Z \in \mathcal{A}_1$, and replacing $Z$ with $Z^*$ gives $\Phi(X)\Phi(Z) = \|\Phi\|_S\,\Phi(XZ)$. The implication in the other direction is obvious. The assertion about the algebraic structure of $\mathcal{M}_\Phi$ follows immediately from (3.7). □

For a map $\gamma$ from a C*-algebra into itself, we denote by $\ker(\mathrm{id} - \gamma)$ the set of fixed points of $\gamma$.

3.10 Corollary. Let $\gamma : \mathcal{A} \to \mathcal{A}$ be a Schwarz contraction, and assume that there exists a strictly positive linear functional $\alpha$ on $\mathcal{A}$ such that $\alpha \circ \gamma = \alpha$. Then $\|\gamma\|_S = \|\gamma\| = 1$, $\ker(\mathrm{id} - \gamma)$ is a non-zero C*-algebra, $\gamma$ is a C*-algebra morphism on $\ker(\mathrm{id} - \gamma)$, and $\gamma_\infty := \lim_{n\to\infty}\frac{1}{n}\sum_{k=1}^n \gamma^k$ is an $\alpha$-preserving conditional expectation onto $\ker(\mathrm{id} - \gamma)$.

Proof. The assumption $\alpha \circ \gamma = \alpha$ is equivalent to $\gamma^*(A) = A$, where $\alpha(X) = \operatorname{Tr} AX$, $X \in \mathcal{A}$, and $A$ is strictly positive definite. Thus 1 is an eigenvalue of $\gamma^*$ and therefore also of $\gamma$. Hence, the fixed-point set of $\gamma$ contains a non-zero element, and it is obviously a linear subspace in $\mathcal{A}$, which is also self-adjoint due to the positivity of $\gamma$. If $X \in \ker(\mathrm{id} - \gamma)$ then $0 \le \alpha\big(\gamma(X^*X) - \gamma(X^*)\gamma(X)\big) = \alpha(\gamma(X^*X)) - \alpha(X^*X) = 0$, and hence $\gamma(X^*X) = \gamma(X^*)\gamma(X) = X^*X$, i.e., $X^*X \in \ker(\mathrm{id} - \gamma)$. The polarization identity then yields that $\ker(\mathrm{id} - \gamma)$ is closed also under multiplication, so it is a C*-subalgebra of $\mathcal{A}$. Let $I$ be the unit of $\ker(\mathrm{id} - \gamma)$; then $1 = \|I\| = \|\gamma(I)\| \le \|\gamma\| \le \|\gamma\|_S \le 1$, so $\|\gamma\|_S = 1$. Repeating the above argument with $X^*$ yields that $\ker(\mathrm{id} - \gamma) \subset \mathcal{M}_\gamma \cap \mathcal{M}_\gamma^*$, where $\mathcal{M}_\gamma$ is defined as in Lemma 3.9. Moreover, by Lemma 3.9, $\gamma$ is a C*-algebra morphism on $\mathcal{M}_\gamma \cap \mathcal{M}_\gamma^*$, and hence also on $\ker(\mathrm{id} - \gamma)$. Note that $\langle X, Y\rangle := \alpha(X^*Y)$ defines an inner product on $\mathcal{A}$ with respect to which $\gamma$ is a contraction, and hence $\gamma_\infty$ exists and is the orthogonal projection onto $\ker(\mathrm{id} - \gamma)$, due to von Neumann's mean ergodic theorem. By Lemma 3.9 we have $\gamma(XY) = \gamma(X)\gamma(Y) = X\gamma(Y)$ for any $X \in \ker(\mathrm{id} - \gamma)$ and $Y \in \mathcal{A}$, which yields that $\gamma_\infty$ is a conditional expectation. □

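3.11 Lemma. Let $B_1 := B \in \mathcal{A}_{1,+}$ be non-zero, and let $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ be a trace non-increasing 2-positive map such that $\operatorname{Tr}\Phi(B) = \operatorname{Tr} B$. Let $B_2 := \Phi(B)$. Then there exist decompositions $\operatorname{supp} B_m = \bigoplus_{k=1}^r \mathcal{H}_{m,k,L} \otimes \mathcal{H}_{m,k,R}$, $m = 1,2$, invertible density operators $\omega_{B,k}$ on $\mathcal{H}_{1,k,R}$ and $\tilde\omega_{B,k}$ on $\mathcal{H}_{2,k,R}$, and unitaries $U_k : \mathcal{H}_{1,k,L} \to \mathcal{H}_{2,k,L}$ such that

\[ \ker(\mathrm{id} - \Phi_B^* \circ \Phi)_+ = \bigoplus_{k=1}^r \mathcal{B}(\mathcal{H}_{1,k,L})_+ \otimes \omega_{B,k}, \]

\[ \Phi(A_{1,k,L} \otimes \omega_{B,k}) = U_k A_{1,k,L} U_k^* \otimes \tilde\omega_{B,k}, \qquad A_{1,k,L} \in \mathcal{B}(\mathcal{H}_{1,k,L}). \tag{3.8} \]

Proof. Let $\tilde{\mathcal{A}}_1 := B^0\mathcal{A}_1B^0$, $\tilde{\mathcal{A}}_2 := \Phi(B)^0\mathcal{A}_2\Phi(B)^0$, and define $\tilde\Phi : \tilde{\mathcal{A}}_1 \to \tilde{\mathcal{A}}_2$ as $\tilde\Phi(X) := \Phi(B^0XB^0) = \Phi(X)$, $X \in \tilde{\mathcal{A}}_1$. Then $\tilde\Phi^*(Y) = B^0\Phi^*(Y)B^0$, $Y \in \tilde{\mathcal{A}}_2$, and a straightforward computation verifies that $\tilde\Phi_B(X) := \tilde\Phi(B)^{-1/2}\tilde\Phi(B^{1/2}XB^{1/2})\tilde\Phi(B)^{-1/2} = \Phi_B(X)$, $X \in \tilde{\mathcal{A}}_1$, and $\tilde\Phi_B^*(Y) := B^{1/2}\tilde\Phi^*(\tilde\Phi(B)^{-1/2}Y\tilde\Phi(B)^{-1/2})B^{1/2} = \Phi_B^*(Y)$, $Y \in \tilde{\mathcal{A}}_2$. Let $\gamma_1 := \tilde\Phi^* \circ \tilde\Phi_B$ and $\gamma_2 := \tilde\Phi_B \circ \tilde\Phi^*$. Obviously, $\gamma_1$ and $\gamma_2$ are again 2-positive and, since

\[ \begin{aligned} \gamma_1(B^0) &= \tilde\Phi^*(\Phi(B)^0) = B^0\Phi^*(\Phi(B)^0)B^0 = B^0, \\ \gamma_2(\Phi(B)^0) &= \Phi(B)^{-1/2}\Phi\big(B^{1/2}\Phi^*(\Phi(B)^0)B^{1/2}\big)\Phi(B)^{-1/2} = \Phi(B)^0 \end{aligned} \]

due to Lemma 3.2, they are also unital. Hence, $\|\gamma_i\|_S = \|\gamma_i\| = 1$, $i = 1,2$. Note that if $A_1 := A \in \ker(\mathrm{id} - \Phi_B^* \circ \Phi)_+$ then $A^0 \le B^0$ and hence $A \in \tilde{\mathcal{A}}_1$, and

\[ \gamma_1^*(A + B) = \Phi_B^*(\Phi(A + B)) = A + B, \qquad \gamma_2^*(\Phi(A + B)) = \Phi\big(\Phi_B^*(\Phi(A + B))\big) = \Phi(A + B). \]

Let $A_2 := \Phi(A_1)$. By the above, $\gamma_m$ leaves the faithful state $\alpha_m$ with density $(A_m + B_m)/\operatorname{Tr}(A_m + B_m)$ invariant, and hence, by Corollary 3.10, $\ker(\mathrm{id} - \gamma_m)$ is a C*-algebra of the form $\ker(\mathrm{id} - \gamma_m) = \bigoplus_{k=1}^r \mathcal{B}(\mathcal{H}_{m,k,L}) \otimes I_{m,k,R}$, where $\bigoplus_{k=1}^r \mathcal{H}_{m,k,L} \otimes \mathcal{H}_{m,k,R}$ is a decomposition of $\operatorname{supp} B_m$. Moreover, $\lim_{n\to\infty}\frac{1}{n}\sum_{k=1}^n \gamma_m^k$ gives an $\alpha_m$-preserving conditional expectation onto $\ker(\mathrm{id} - \gamma_m)$, for $m = 1,2$. Hence, by Takesaki's theorem [49], $(A_m + B_m)^{it}\ker(\mathrm{id} - \gamma_m)(A_m + B_m)^{-it} = \ker(\mathrm{id} - \gamma_m)$. Now the argument of Section 3 in [33] yields the existence of invertible density operators $\omega_{A,B,k}$ on $\mathcal{H}_{1,k,R}$ and positive definite operators $X_{1,k,L,A,B}$ on $\mathcal{H}_{1,k,L}$ such that $A + B = \bigoplus_{k=1}^r X_{1,k,L,A,B} \otimes \omega_{A,B,k}$. By Theorem 9.11 in [32], we have $(A + B)^{it}B^{-it} \in \ker(\mathrm{id} - \gamma_1)$ for every $t \in \mathbb{R}$, which yields that $\omega_{A,B,k}$ is independent of $A$, and hence that every $A \in \ker(\mathrm{id} - \Phi_B^* \circ \Phi)_+$ can be written in the form $A = \bigoplus_{k=1}^r A_{1,k,L} \otimes \omega_{B,k}$ with $\omega_{B,k} := \omega_{A,B,k}$ and some positive semidefinite operators $A_{1,k,L}$ on $\mathcal{H}_{1,k,L}$. This shows that $\ker(\mathrm{id} - \Phi_B^* \circ \Phi)_+ \subset \bigoplus_{k=1}^r \mathcal{B}(\mathcal{H}_{1,k,L})_+ \otimes \omega_{B,k}$. For the proof of (3.8), we refer to Theorem 4.2.1 in [52]. Finally, the decomposition $B = \bigoplus_{k=1}^r B_{1,k,L} \otimes \omega_{B,k}$ together with (3.8) shows that $\ker(\mathrm{id} - \Phi_B^* \circ \Phi)_+ \supset \bigoplus_{k=1}^r \mathcal{B}(\mathcal{H}_{1,k,L})_+ \otimes \omega_{B,k}$. □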

4 Monotonicity 

Now we turn to the proof of the monotonicity of the $f$-divergences under substochastic maps. Let $\mathcal{A}_i \subset \mathcal{B}(\mathcal{H}_i)$ be finite-dimensional C*-algebras for $i = 1,2$. Recall that we call a map $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ substochastic if $\Phi^*$ satisfies the Schwarz inequality

\[ \Phi^*(Y^*)\Phi^*(Y) \le \Phi^*(Y^*Y), \qquad Y \in \mathcal{A}_2, \]

and $\Phi$ is called stochastic if it is a trace-preserving substochastic map.

For a $B \in \mathcal{A}_{1,+}$ and a substochastic map $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$, we define the map $V : \mathcal{A}_2 \to \mathcal{A}_1$ as

\[ V(X) := \Phi^*\big(X\Phi(B)^{-1/2}\big)B^{1/2}, \qquad X \in \mathcal{A}_2. \tag{4.1} \]

Note that $V = R_{B^{1/2}} \circ \Phi^* \circ R_{\Phi(B)^{-1/2}}$ and hence $V^* = R_{\Phi(B)^{-1/2}} \circ \Phi \circ R_{B^{1/2}}$, which yields

\[ V^*(B^{1/2}) = \Phi(B)^{1/2}. \tag{4.2} \]
4.1 Lemma. We have the following equivalence:

\[ V(\Phi(B)^{1/2}) = B^{1/2} \quad\text{if and only if}\quad \operatorname{Tr}\Phi(B) = \operatorname{Tr} B. \]

Proof. By definition,

\[ V(\Phi(B)^{1/2}) = \Phi^*\big(\Phi(B)^{1/2}\Phi(B)^{-1/2}\big)B^{1/2} = \Phi^*(\Phi(B)^0)B^{1/2}. \]

Hence, if $\operatorname{Tr}\Phi(B) = \operatorname{Tr} B$ then $V(\Phi(B)^{1/2}) = B^{1/2}$ due to Lemma 3.2. On the other hand, $B^{1/2} = V(\Phi(B)^{1/2}) = \Phi^*(\Phi(B)^0)B^{1/2}$ yields $\Phi^*(\Phi(B)^0)B^n = B^n$, $n \in \mathbb{N}$, and hence also $\Phi^*(\Phi(B)^0)B^0 = B^0$, which, by Lemma 3.2, in turn yields $\operatorname{Tr}\Phi(B) = \operatorname{Tr} B$. □

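4.2 Lemma. The map $V$ is a contraction and

\[ V^*\big(L_A R_{B^{-1}}\big)V \le L_{\Phi(A)}R_{\Phi(B)^{-1}}. \tag{4.3} \]

Moreover, when $\Phi^*$ is a C*-algebra morphism, $V$ is an isometry if $\Phi(B)$ is invertible, and (4.3) holds with equality if $B$ is invertible.

Proof. Let $X \in \mathcal{A}_2$. Then,

\[ \begin{aligned} \|VX\|_{HS}^2 = \operatorname{Tr}(VX)^*(VX) &= \operatorname{Tr} B^{1/2}\Phi^*\big(\Phi(B)^{-1/2}X^*\big)\Phi^*\big(X\Phi(B)^{-1/2}\big)B^{1/2} \\ &\le \|\Phi^*\|_S \operatorname{Tr} B^{1/2}\Phi^*\big(\Phi(B)^{-1/2}X^*X\Phi(B)^{-1/2}\big)B^{1/2} \qquad (4.4) \\ &= \|\Phi^*\|_S \operatorname{Tr}\Phi(B)\Phi(B)^{-1/2}X^*X\Phi(B)^{-1/2} = \|\Phi^*\|_S \operatorname{Tr}\Phi(B)^0X^*X \\ &\le \|\Phi^*\|_S \operatorname{Tr} X^*X = \|\Phi^*\|_S\|X\|_{HS}^2 \le \|X\|_{HS}^2. \qquad (4.5) \end{aligned} \]

If $\Phi^*$ is a C*-algebra morphism then $\|\Phi^*\|_S = 1$ and the inequality in (4.4) holds with equality, and if $\Phi(B)$ is invertible then $\Phi(B)^0 = I_2$ and the inequality in (4.5) holds with equality. Similarly,

\[ \begin{aligned} \big\langle X, V^*\big(L_AR_{B^{-1}}\big)VX\big\rangle_{HS} &= \operatorname{Tr}(VX)^*A(VX)B^{-1} \\ &= \operatorname{Tr} B^{1/2}\Phi^*\big(\Phi(B)^{-1/2}X^*\big)A\,\Phi^*\big(X\Phi(B)^{-1/2}\big)B^{1/2}B^{-1} \\ &= \operatorname{Tr} A\,\Phi^*\big(X\Phi(B)^{-1/2}\big)B^0\,\Phi^*\big(\Phi(B)^{-1/2}X^*\big) \\ &\le \operatorname{Tr} A\,\Phi^*\big(X\Phi(B)^{-1/2}\big)\Phi^*\big(\Phi(B)^{-1/2}X^*\big) \qquad (4.6) \\ &\le \|\Phi^*\|_S \operatorname{Tr} A\,\Phi^*\big(X\Phi(B)^{-1/2}\Phi(B)^{-1/2}X^*\big) \qquad (4.7) \\ &= \|\Phi^*\|_S \operatorname{Tr}\Phi(A)X\Phi(B)^{-1}X^* = \|\Phi^*\|_S\big\langle X, L_{\Phi(A)}R_{\Phi(B)^{-1}}X\big\rangle_{HS} \\ &\le \big\langle X, L_{\Phi(A)}R_{\Phi(B)^{-1}}X\big\rangle_{HS}. \qquad (4.8) \end{aligned} \]

If $\Phi^*$ is a C*-algebra morphism then $\|\Phi^*\|_S = 1$ and the inequalities in (4.7) and (4.8) hold with equality, and if $B$ is invertible then (4.6) holds with equality. □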

Recall that a real-valued function $f$ on $[0,+\infty)$ is operator convex if $f(tA + (1-t)B) \le tf(A) + (1-t)f(B)$, $t \in [0,1]$, for any positive semi-definite operators $A, B$ on any finite-dimensional Hilbert space (or equivalently, on some infinite-dimensional Hilbert space). For a continuous real-valued function $f$ on $[0,+\infty)$, the following are equivalent (see [13, Theorem 2.1]): (i) $f$ is operator convex on $[0,+\infty)$ and $f(0) \le 0$; (ii) $f(V^*AV) \le V^*f(A)V$ for any contraction $V$ and any positive semi-definite operator $A$. The function $f$ is operator monotone decreasing if $f(A) \ge f(B)$ whenever $A$ and $B$ are such that $0 \le A \le B$. If $f$ is operator monotone decreasing on $[0,+\infty)$ then it is also operator convex (see the proof of [13, Theorem 2.5] or [H Theorem V.2.5]). A function $f$ is operator concave (resp., operator monotone increasing) if $-f$ is operator convex (resp., operator monotone decreasing). An operator convex function on $[0,+\infty)$ is automatically continuous on $(0,+\infty)$, but might be discontinuous at 0. For instance, a straightforward computation shows that the characteristic function $\mathbf{1}_{\{0\}}$ of the set $\{0\}$ is operator convex on $[0,+\infty)$. It is easy to verify that the functions

\[ \varphi_t(x) := -\frac{x}{x+t} = -1 + \frac{t}{x+t} \tag{4.9} \]

are operator monotone decreasing and hence operator convex on $[0,+\infty)$ for every $t \in (0,+\infty)$.
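As a sanity check of the last claim, the following sketch evaluates $\varphi_t(M) = -I + t(M + tI)^{-1}$ on a pair of $2\times 2$ matrices with $0 \le A \le B$ and tests that $\varphi_t(A) - \varphi_t(B)$ is positive semidefinite. It is only an illustration under stated assumptions: the matrices are real symmetric $2\times 2$ examples chosen here, the helper names are hypothetical, and a single example does not prove operator monotonicity.

```python
from fractions import Fraction

def inv2(m):
    """Inverse of an invertible 2x2 matrix with rational entries."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def phi_t(m, t):
    """phi_t(M) = -I + t (M + t I)^{-1} for a 2x2 positive semidefinite M, t > 0."""
    shifted = [[Fraction(m[i][j]) + (t if i == j else 0) for j in range(2)]
               for i in range(2)]
    inv = inv2(shifted)
    return [[t * inv[i][j] - (1 if i == j else 0) for j in range(2)]
            for i in range(2)]

def is_psd2(m):
    """A real symmetric 2x2 matrix is PSD iff its trace and determinant are >= 0."""
    return (m[0][0] + m[1][1] >= 0
            and m[0][0] * m[1][1] - m[0][1] * m[1][0] >= 0)

A = [[1, 0], [0, 0]]
B = [[2, 1], [1, 1]]   # B - A = [[1, 1], [1, 1]] is PSD, so 0 <= A <= B

for t in (Fraction(1, 2), Fraction(1), Fraction(5)):
    pa, pb = phi_t(A, t), phi_t(B, t)
    diff = [[pa[i][j] - pb[i][j] for j in range(2)] for i in range(2)]
    assert is_psd2(diff)   # phi_t(A) >= phi_t(B)
```

Exact rational arithmetic is used because $B - A$ has rank one here, so the determinant of the difference is exactly zero and a floating-point test could fail by rounding.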

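4.3 Theorem. Let $A, B \in \mathcal{A}_{1,+}$, let $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ be a substochastic map such that $\operatorname{Tr}\Phi(B) = \operatorname{Tr} B$, and let $f$ be an operator convex function on $[0,+\infty)$. Assume that

\[ \operatorname{Tr}\Phi(A) = \operatorname{Tr} A \qquad\text{or}\qquad 0 \le \omega(f). \tag{4.10} \]

Then,

\[ S_f(\Phi(A)\|\Phi(B)) \le S_f(A\|B). \tag{4.11} \]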

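Proof. First we prove the theorem when $f$ is continuous at 0. Due to Theorem 8.1 we have the representation

\[ f(x) = f(0) + ax + bx^2 + \int_{(0,+\infty)}\left(\frac{x}{1+t} + \varphi_t(x)\right)d\mu(t), \qquad x \in [0,+\infty), \]

where $b \ge 0$ and $\varphi_t(x)$ is given in (4.9). Define

\[ \Delta := L_AR_{B^{-1}} \qquad\text{and}\qquad \tilde\Delta := L_{\Phi(A)}R_{\Phi(B)^{-1}}. \]

Then

\[ S_f(A\|B) = f(0)\operatorname{Tr} B + a\operatorname{Tr} AB^0 + b\operatorname{Tr} A^2B^{-1} + \int_{(0,+\infty)}\left(\frac{\operatorname{Tr} AB^0}{1+t} + S_{\varphi_t}(A\|B)\right)d\mu(t) + \omega(f)\operatorname{Tr} A(I - B^0). \]

Note that $\operatorname{Tr} B = \operatorname{Tr}\Phi(B)$ by assumption. Since $\varphi_t$ is operator convex, operator monotone decreasing and $\varphi_t(0) = 0$, we have

\[ V^*\varphi_t(\Delta)V \ge \varphi_t(V^*\Delta V) \ge \varphi_t(\tilde\Delta) \tag{4.12} \]

for the contraction $V$ defined in (4.1), due to (4.3) and [13, Theorem 2.1] as mentioned above. Hence, by Lemma 4.1,

\[ \begin{aligned} S_{\varphi_t}(A\|B) = \big\langle B^{1/2}, \varphi_t(\Delta)B^{1/2}\big\rangle_{HS} &= \big\langle V\Phi(B)^{1/2}, \varphi_t(\Delta)V\Phi(B)^{1/2}\big\rangle_{HS} \\ &\ge \big\langle \Phi(B)^{1/2}, \varphi_t(\tilde\Delta)\Phi(B)^{1/2}\big\rangle_{HS} = S_{\varphi_t}(\Phi(A)\|\Phi(B)). \end{aligned} \tag{4.13} \]

Therefore, in order to prove the monotonicity inequality (4.11), it suffices to prove that

\[ a\operatorname{Tr} AB^0 \ge a\operatorname{Tr}\Phi(A)\Phi(B)^0, \tag{4.14} \]
\[ b\operatorname{Tr} A^2B^{-1} \ge b\operatorname{Tr}\Phi(A)^2\Phi(B)^{-1}, \tag{4.15} \]
\[ \omega(f)\operatorname{Tr} A(I_1 - B^0) \ge \omega(f)\operatorname{Tr}\Phi(A)(I_2 - \Phi(B)^0). \tag{4.16} \]

Assume first that $\operatorname{supp} A \le \operatorname{supp} B$, and hence also $\operatorname{Tr}\Phi(A) = \operatorname{Tr} A$ (see Lemma 3.2). Then both sides are equal to zero in (4.16), and $\operatorname{Tr}\Phi(A)\Phi(B)^0 = \operatorname{Tr}\Phi(A) = \operatorname{Tr} A = \operatorname{Tr} AB^0$ yields that (4.14) holds also with equality. Finally, since $b \ge 0$, (4.15) follows by Lemma 3.5.

Next, assume that $\operatorname{Tr}\Phi(A) = \operatorname{Tr} A$, and define $B_\varepsilon := B + \varepsilon A$, $\varepsilon > 0$. Then $\operatorname{Tr}\Phi(B_\varepsilon) = \operatorname{Tr}\Phi(B) + \varepsilon\operatorname{Tr}\Phi(A) = \operatorname{Tr} B + \varepsilon\operatorname{Tr} A = \operatorname{Tr} B_\varepsilon$, and $\operatorname{supp} A \le \operatorname{supp} B_\varepsilon$. Hence, by the previous argument, $S_f(\Phi(A)\|\Phi(B_\varepsilon)) \le S_f(A\|B_\varepsilon)$. Taking $\varepsilon \searrow 0$ and using Proposition 2.12, we obtain (4.11).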

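If $\omega(f) = +\infty$, then either $\operatorname{supp} A \nleq \operatorname{supp} B$, in which case

\[ S_f(A\|B) = +\infty \ge S_f(\Phi(A)\|\Phi(B)), \]

or we have $\operatorname{supp} A \le \operatorname{supp} B$, and hence (4.11) follows by the previous argument.

Finally, assume that $0 \le \omega(f) < +\infty$. By Proposition 8.4, this yields the representation

\[ f(x) = f(0) + \omega(f)x + \int_{(0,+\infty)}\varphi_t(x)\,d\mu(t), \]

and hence

\[ \begin{aligned} S_f(A\|B) &= f(0)\operatorname{Tr} B + \omega(f)\operatorname{Tr} AB^0 + \int_{(0,+\infty)}S_{\varphi_t}(A\|B)\,d\mu(t) + \omega(f)\operatorname{Tr} A(I - B^0) \\ &= f(0)\operatorname{Tr} B + \omega(f)\operatorname{Tr} A + \int_{(0,+\infty)}S_{\varphi_t}(A\|B)\,d\mu(t). \end{aligned} \]

Since $\operatorname{Tr}\Phi(A) \le \operatorname{Tr} A$, inequality (4.11) follows from (4.13).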

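So far, we have proved the theorem for the case where $f$ is continuous at 0. Consider the functions $\tilde f_\alpha(x) := -x^\alpha$, $x \ge 0$, $0 < \alpha < 1$. Then $\tilde f_\alpha$ is operator convex, continuous at 0 and $\omega(\tilde f_\alpha) = 0$ for all $\alpha \in (0,1)$. Hence, by the above, we have

\[ -\operatorname{Tr}\Phi(A)^\alpha\Phi(B)^{1-\alpha} = S_{\tilde f_\alpha}(\Phi(A)\|\Phi(B)) \le S_{\tilde f_\alpha}(A\|B) = -\operatorname{Tr} A^\alpha B^{1-\alpha}, \qquad \alpha \in (0,1). \tag{4.17} \]

Taking the limit $\alpha \searrow 0$, we obtain

\[ \operatorname{Tr}\Phi(A)^0\Phi(B) \ge \operatorname{Tr} A^0B, \tag{4.18} \]

which in turn yields

\[ S_{\mathbf{1}_{\{0\}}}(\Phi(A)\|\Phi(B)) = \operatorname{Tr}\Phi(B) - \operatorname{Tr}\Phi(A)^0\Phi(B) \le \operatorname{Tr} B - \operatorname{Tr} A^0B = S_{\mathbf{1}_{\{0\}}}(A\|B). \tag{4.19} \]

Assume now that $f$ is an operator convex function on $[0,+\infty)$ that is not necessarily continuous at 0. Convexity of $f$ yields that $f(0^+) := \lim_{x\searrow 0}f(x)$ is finite, and $\alpha := f(0) - f(0^+) \ge 0$. Note that $\tilde f := f - \alpha\mathbf{1}_{\{0\}}$ is operator convex and continuous at 0, $\omega(\tilde f) = \omega(f)$, and $S_f(A\|B) = S_{\tilde f}(A\|B) + \alpha S_{\mathbf{1}_{\{0\}}}(A\|B)$ for any $A, B \in \mathcal{A}_{1,+}$. Applying the previous argument to $\tilde f$ and using (4.19), we see that

\[ \begin{aligned} S_f(\Phi(A)\|\Phi(B)) &= S_{\tilde f}(\Phi(A)\|\Phi(B)) + \alpha S_{\mathbf{1}_{\{0\}}}(\Phi(A)\|\Phi(B)) \\ &\le S_{\tilde f}(A\|B) + \alpha S_{\mathbf{1}_{\{0\}}}(A\|B) = S_f(A\|B) \end{aligned} \]

if any of the conditions in (4.10) holds, completing the proof of the theorem. □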
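4.4 Remark. Note that $\operatorname{supp} A \le \operatorname{supp} B$ is also sufficient for (4.11) to hold, due to Lemma

4.5 Example. Let $A, B \in \mathcal{A}_{1,+}$ and $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ be a substochastic map such that $\operatorname{Tr}\Phi(B) = \operatorname{Tr} B$. Let $\operatorname{sgn} x := x/|x|$, $x \ne 0$, and define $\tilde f_\alpha := \operatorname{sgn}(\alpha - 1)f_\alpha$, $0 \le \alpha \ne 1$, where $f_\alpha$ is given in Example 2.7. Since $\tilde f_\alpha$ is operator convex, and $\omega(\tilde f_\alpha) \ge 0$ for all $\alpha \in [0,2]\setminus\{1\}$, Theorem 4.3 yields that

\[ \operatorname{sgn}(\alpha - 1)\operatorname{Tr}\Phi(A)^\alpha\Phi(B)^{1-\alpha} = S_{\tilde f_\alpha}(\Phi(A)\|\Phi(B)) \le S_{\tilde f_\alpha}(A\|B) = \operatorname{sgn}(\alpha - 1)\operatorname{Tr} A^\alpha B^{1-\alpha} \tag{4.20} \]

when $\alpha \in (1,2]$ and $\operatorname{supp} A \le \operatorname{supp} B$. (Note that $S_{\tilde f_\alpha}(\Phi(A)\|\Phi(B)) \le S_{\tilde f_\alpha}(A\|B) = +\infty$ is trivial when $\alpha \in (1,2]$ and $\operatorname{supp} A \nleq \operatorname{supp} B$.) The same inequality has been shown in the proof of Theorem 4.3 for $\alpha \in [0,1)$; see (4.17) and (4.18). This yields the monotonicity of the Rényi relative entropies,

\[ S_\alpha(\Phi(A)\|\Phi(B)) = \frac{1}{\alpha - 1}\log S_{f_\alpha}(\Phi(A)\|\Phi(B)) \le \frac{1}{\alpha - 1}\log S_{f_\alpha}(A\|B) = S_\alpha(A\|B) \tag{4.21} \]

for $\alpha \in [0,2]\setminus\{1\}$.

Since $\omega(f) \ge 0$ for $f(x) := x\log x$, Theorem 4.3 also yields the monotonicity of the relative entropy,

\[ S(\Phi(A)\|\Phi(B)) \le S(A\|B). \]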
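In the commutative case, where $A$ and $B$ are diagonal with entries $p_i$ and $q_i > 0$, the $f$-divergence reduces to the classical expression $\sum_i q_i f(p_i/q_i)$ and a stochastic map acts as a column-stochastic matrix. The sketch below illustrates the monotonicity inequality in this special case for $f_2(x) = x^2$ and $f(x) = x\log x$. The vectors, the matrix `T`, and the function names are assumptions chosen for illustration; the example covers only this classical situation, not the general theorem.

```python
from fractions import Fraction
from math import log

def f_divergence(f, p, q):
    """Classical f-divergence sum_i q_i f(p_i / q_i), assuming every q_i > 0."""
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))

def apply_stochastic(T, p):
    """Apply a column-stochastic matrix T (rows index outputs) to the vector p."""
    return [sum(row[j] * p[j] for j in range(len(p))) for row in T]

p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
q = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)]
T = [[1, 1, 0],   # merge the first two outcomes
     [0, 0, 1]]   # keep the third outcome

Tp, Tq = apply_stochastic(T, p), apply_stochastic(T, q)

def square(x):
    """f_2(x) = x^2, evaluated exactly on rationals."""
    return x * x

def xlogx(x):
    """f(x) = x log x, evaluated in floating point, with f(0) = 0."""
    return float(x) * log(float(x)) if x > 0 else 0.0

# The divergence does not increase under the stochastic map.
assert f_divergence(square, Tp, Tq) <= f_divergence(square, p, q)
assert f_divergence(xlogx, Tp, Tq) <= f_divergence(xlogx, p, q)
print(f_divergence(square, p, q), f_divergence(square, Tp, Tq))
```

Merging outcomes is the simplest non-invertible stochastic map; here the $f_2$-divergence drops from $3/2$ to $13/9$.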

4.6 Remark. In the proof of Theorem 4.3 it was essential that $f$ is operator convex, but it is not known if it is actually necessary. See Appendix A for some special cases where convexity of $f$ is sufficient.

Theorem 4.3 yields the joint convexity of the $f$-divergences:

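4.7 Corollary. Let $A_i, B_i \in \mathcal{A}_+$ and $p_i \ge 0$ for $i = 1,\ldots,r$, and let $f$ be an operator convex function on $[0,+\infty)$. Then

\[ S_f\Big(\sum\nolimits_i p_iA_i\,\Big\|\,\sum\nolimits_i p_iB_i\Big) \le \sum\nolimits_i p_i S_f(A_i\|B_i). \]

Proof. Let $\delta_1,\ldots,\delta_r$ be a set of orthogonal rank-one projections on $\mathbb{C}^r$, and define $A := \sum_{i=1}^r p_iA_i \otimes \delta_i$, $B := \sum_{i=1}^r p_iB_i \otimes \delta_i$. The map $\Phi : \mathcal{A} \otimes \mathcal{B}(\mathbb{C}^r) \to \mathcal{A}$, given by $\Phi(X \otimes Y) := X\operatorname{Tr} Y$, $X \in \mathcal{A}$, $Y \in \mathcal{B}(\mathbb{C}^r)$, is completely positive and trace-preserving and hence, by Theorem 4.3,

\[ S_f\Big(\sum\nolimits_i p_iA_i\,\Big\|\,\sum\nolimits_i p_iB_i\Big) = S_f(\Phi(A)\|\Phi(B)) \le S_f(A\|B) = \sum\nolimits_i p_i S_f(A_i\|B_i), \tag{4.22} \]

where the last identity is due to Corollary 2.5. □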

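4.8 Remark. For an operator convex function $f$ on $[0,+\infty)$ let $\mathcal{M}_f(\mathcal{A}_1, \mathcal{A}_2)$ denote the set of positive linear maps $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ such that the monotonicity $S_f(\Phi(A)\|\Phi(B)) \le S_f(A\|B)$ holds for all $A, B \in \mathcal{A}_{1,+}$. The joint convexity of the $f$-divergences shows that $\mathcal{M}_f(\mathcal{A}_1, \mathcal{A}_2)$ is convex. Indeed, if $\Phi_1, \Phi_2 \in \mathcal{M}_f(\mathcal{A}_1, \mathcal{A}_2)$ then Corollary 4.7 yields

\[ \begin{aligned} &S_f\big((1-\lambda)\Phi_1(A) + \lambda\Phi_2(A)\,\big\|\,(1-\lambda)\Phi_1(B) + \lambda\Phi_2(B)\big) \\ &\qquad \le (1-\lambda)S_f(\Phi_1(A)\|\Phi_1(B)) + \lambda S_f(\Phi_2(A)\|\Phi_2(B)) \\ &\qquad \le (1-\lambda)S_f(A\|B) + \lambda S_f(A\|B) = S_f(A\|B) \end{aligned} \]

for any $\lambda \in [0,1]$ and $A, B \in \mathcal{A}_{1,+}$. Note also that if $\Phi_1 \in \mathcal{M}_f(\mathcal{A}_1, \mathcal{A}_2)$ and $\Phi_2 \in \mathcal{M}_f(\mathcal{A}_2, \mathcal{A}_3)$ then $\Phi_2 \circ \Phi_1 \in \mathcal{M}_f(\mathcal{A}_1, \mathcal{A}_3)$.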

We say that a linear map $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ is a co-Schwarz map if there is a $c \in [0,+\infty)$ such that

\[ \Phi(X^*)\Phi(X) \le c\,\Phi(XX^*), \qquad X \in \mathcal{A}_1, \]

and it is a co-Schwarz contraction if the above inequality holds with $c = 1$. It is easy to see that a linear map $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ is a co-Schwarz map (resp., a co-Schwarz contraction) if and only if there is a Schwarz map (resp., a Schwarz contraction) $\tilde\Phi : \mathcal{A}_1^T \to \mathcal{A}_2$ such that $\Phi = \tilde\Phi \circ T$, where $T(X) := X^T$ denotes the transpose of $X \in \mathcal{A}_1$ with respect to a fixed orthonormal basis of $\mathcal{H}_1$, and $\mathcal{A}_1^T := \{X^T : X \in \mathcal{A}_1\} \subset \mathcal{B}(\mathcal{H}_1)$. Furthermore, we say that $\Phi$ is co-substochastic (resp., co-stochastic) if $\Phi^*$ is a co-Schwarz contraction (resp., a unital co-Schwarz contraction). Theorem 4.3 holds also when $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ is a co-substochastic map. This follows immediately from Theorem 4.3 and the fact that transpositions leave every $f$-divergence invariant (see (iii) of Corollary 2.5). Alternatively, this can be proved by replacing the operator $V$ defined in (4.1) with the conjugate-linear map

\[ \tilde V(X) := \Phi^*\big(\Phi(B)^{-1/2}X^*\big)B^{1/2}, \qquad X \in \mathcal{A}_2, \tag{4.23} \]

and following the proofs of Lemma 4.2 and Theorem 4.3 with $\tilde V$ in place of $V$.

Recall that a positive map is called decomposable if it can be written as the sum of a completely positive map and a completely positive map composed with a transposition. By the above, a similar notion of decomposability is sufficient for the monotonicity of the $f$-divergences. Namely, if a trace-preserving positive map $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ is decomposable in the sense that it can be written as a convex combination of a stochastic and a co-stochastic map then $\Phi \in \mathcal{M}_f(\mathcal{A}_1, \mathcal{A}_2)$ for any operator convex function $f$ on $[0,+\infty)$. Example 3.6 provides simple examples of trace-preserving positive maps that are decomposable in this sense but which are neither stochastic nor co-stochastic.



5 Equality in the monotonicity 

In this section we analyze the situation where the monotonicity inequality

\[ S_f(\Phi(A)\|\Phi(B)) \le S_f(A\|B) \]

holds with equality, based on the integral representation of operator convex functions that we give in Section 8. Let $\mathcal{F}$ be the set of continuous non-linear operator convex functions $f$ on $[0,+\infty)$ that satisfy

\[ \lim_{x\to+\infty}\frac{f(x)}{x^2} = 0. \]

By Corollary 8.2, $f \in \mathcal{F}$ if and only if there exists a positive measure $\mu_f$ and a function $\psi_f$ on $(0,+\infty)$ such that

\[ f(x) = f(0) + \int_{(0,+\infty)}\big(\psi_f(t)x + \varphi_t(x)\big)\,d\mu_f(t), \tag{5.1} \]

where $\varphi_t$ is defined in (4.9).

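Recall that $\operatorname{spec}(X)$ denotes the spectrum of an operator $X$. We will use the notation $|H|$ to denote the cardinality of a set $H$. Given $B \in \mathcal{A}_{1,+}$ and a positive map $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$, let $\Phi_B : \mathcal{A}_1 \to \mathcal{A}_2$ and $\Phi_B^* : \mathcal{A}_2 \to \mathcal{A}_1$ be the maps defined in (3.1) and (3.2).

5.1 Theorem. Let $A, B \in \mathcal{A}_{1,+}$ be such that $\operatorname{supp} A \le \operatorname{supp} B$, let $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ be a substochastic map such that $\operatorname{Tr}\Phi(B) = \operatorname{Tr} B$, and define

\[ \Delta := L_AR_{B^{-1}} \qquad\text{and}\qquad \tilde\Delta := L_{\Phi(A)}R_{\Phi(B)^{-1}}. \]

Then, for the following conditions we have

\[ \text{(i)} \Longrightarrow \text{(ii)} \Longrightarrow \text{(iii)} \Longrightarrow \text{(iv)} \Longleftrightarrow \text{(v)} \Longleftrightarrow \text{(vi)} \Longleftrightarrow \text{(vii)} \Longleftrightarrow \text{(viii)} \Longleftrightarrow \text{(ix)} \Longrightarrow \text{(x)}, \]

and if $\Phi$ is 2-positive then (x) $\Longrightarrow$ (i) holds as well.

(i) There exists a stochastic map $\Psi : \mathcal{A}_2 \to \mathcal{A}_1$ such that

\[ \Psi(\Phi(A)) = A, \qquad \Psi(\Phi(B)) = B. \tag{5.2} \]

(ii) There exists a substochastic map $\Psi : \mathcal{A}_2 \to \mathcal{A}_1$ such that (5.2) holds.

(iii) For every operator convex function $f$ on $[0,+\infty)$,

\[ S_f(\Phi(A)\|\Phi(B)) = S_f(A\|B). \tag{5.3} \]

(iv) The equality in (5.3) holds for some $f \in \mathcal{F}$ such that

\[ |\operatorname{supp}\mu_f| \ge |\operatorname{spec}(\Delta) \cup \operatorname{spec}(\tilde\Delta)|. \tag{5.4} \]

(v) There exists a $T \subset (0,+\infty)$ such that $|T| \ge |\operatorname{spec}(\Delta) \cup \operatorname{spec}(\tilde\Delta)|$ and

\[ S_{\varphi_t}(\Phi(A)\|\Phi(B)) = S_{\varphi_t}(A\|B), \qquad t \in T. \]

(vi) $B^0\Phi^*\big(\Phi(B)^{-z}\Phi(A)^z\big) = B^{-z}A^z$ for all $z \in \mathbb{C}$.

(vii) $B^0\Phi^*\big(\Phi(B)^{-\alpha}\Phi(A)^\alpha\big) = B^{-\alpha}A^\alpha$ for some $\alpha \in (0,2)\setminus\{1\}$.

(viii) $B^0\Phi^*\big(\Phi(B)^{-it}\Phi(A)^{it}\big) = B^{-it}A^{it}$ for all $t \in \mathbb{R}$.

(ix) $B^0\Phi^*\big(\log^*\Phi(A) - (\log^*\Phi(B))\Phi(A)^0\big) = \log^*A - (\log^*B)A^0$.

(x) $\Phi_B^*(\Phi(A)) = A$.

Moreover, (ii) $\Longrightarrow$ (iii) holds without assuming that $\operatorname{supp} A \le \operatorname{supp} B$. If $\Phi$ is $n$-positive/completely positive then $\Psi$ in (i) can also be assumed to be $n$-positive/completely positive.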
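Proof. The implication (i) $\Longrightarrow$ (ii) is obvious. Assume that (ii) holds, and let $\tilde A := \Phi(A)$, $\tilde B := \Phi(B)$. Then $\operatorname{Tr} A = \operatorname{Tr}\Psi(\tilde A) \le \operatorname{Tr}\tilde A = \operatorname{Tr}\Phi(A) \le \operatorname{Tr} A$ and similarly for $B$ and $\tilde B$, which yields $\operatorname{Tr}\Psi(\tilde A) = \operatorname{Tr}\tilde A$, $\operatorname{Tr}\Psi(\tilde B) = \operatorname{Tr}\tilde B$ and $\operatorname{Tr}\Phi(A) = \operatorname{Tr} A$, $\operatorname{Tr}\Phi(B) = \operatorname{Tr} B$ (note that this latter is automatic here, and not necessary to assume from the beginning). Applying Theorem 4.3 twice, we get that $S_f(A\|B) = S_f(\Psi(\tilde A)\|\Psi(\tilde B)) \le S_f(\tilde A\|\tilde B) = S_f(\Phi(A)\|\Phi(B)) \le S_f(A\|B)$ for any operator convex function $f$ on $[0,+\infty)$, proving (iii). The implication (iii) $\Longrightarrow$ (iv) is again obvious.

Note that if $A = 0$ then $S_f(A\|B) = f(0)\operatorname{Tr} B$ for any function $f$, and (i)-(x) hold true automatically. Hence, for the rest we will assume that $A \ne 0$ and hence also $B \ne 0$.

Assume that (iv) holds for a function $f \in \mathcal{F}$, and let

\[ f(x) = f(0) + \int_{(0,+\infty)}\big(\psi_f(t)x + \varphi_t(x)\big)\,d\mu_f(t) \]

be the representation given in (5.1). By the assumption $\operatorname{supp} A \le \operatorname{supp} B$, we have

\[ S_f(A\|B) = f(0)\operatorname{Tr} B + \int_{(0,+\infty)}\big(\psi_f(t)\operatorname{Tr} A + S_{\varphi_t}(A\|B)\big)\,d\mu_f(t). \]

By assumption, $\operatorname{Tr}\Phi(B) = \operatorname{Tr} B$, and $\operatorname{supp} A \le \operatorname{supp} B$ yields that also $\operatorname{Tr}\Phi(A) = \operatorname{Tr} A$ (see Lemma 3.2). Hence,

\[ S_f(A\|B) - S_f(\Phi(A)\|\Phi(B)) = \int_{(0,+\infty)}\big(S_{\varphi_t}(A\|B) - S_{\varphi_t}(\Phi(A)\|\Phi(B))\big)\,d\mu_f(t). \]

Since the integrand of the above integral is non-negative for all $t$ due to (4.13), the equality in (iv) means that

\[ S_{\varphi_t}(\Phi(A)\|\Phi(B)) = S_{\varphi_t}(A\|B) \]

for all $t \in \operatorname{supp}\mu_f$. This gives (v) with $T := \operatorname{supp}\mu_f$.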

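Assume now that (v) holds. This means that for every $t \in T$,

\[ 0 = S_{\varphi_t}(A\|B) - S_{\varphi_t}(\Phi(A)\|\Phi(B)) = \big\langle\Phi(B)^{1/2}, \big(V^*\varphi_t(\Delta)V - \varphi_t(\tilde\Delta)\big)\Phi(B)^{1/2}\big\rangle_{HS}, \]

where we used that $V\Phi(B)^{1/2} = B^{1/2}$ due to Lemma 4.1 (note that $\omega(\varphi_t) = 0$, $t > 0$). By (4.12) this is equivalent to

\[ V^*\varphi_t(\Delta)V\Phi(B)^{1/2} = \varphi_t(\tilde\Delta)\Phi(B)^{1/2}, \qquad t \in T, \]

or equivalently,

\[ V^*\left[-I_1 + t(\Delta + tI_1)^{-1}\right]B^{1/2} = \left[-I_2 + t(\tilde\Delta + tI_2)^{-1}\right]\Phi(B)^{1/2}, \qquad t \in T. \]

By (4.2) we get

\[ V^*(\Delta + tI_1)^{-1}B^{1/2} = (\tilde\Delta + tI_2)^{-1}\Phi(B)^{1/2}, \qquad t \in T. \]

Using Lemma 5.2 below and the assumption that $|T| \ge |\operatorname{spec}(\Delta) \cup \operatorname{spec}(\tilde\Delta)|$, we obtain

\[ V^*h(\Delta)B^{1/2} = h(\tilde\Delta)\Phi(B)^{1/2} \tag{5.5} \]

for any function $h$ on $\operatorname{spec}(\Delta) \cup \operatorname{spec}(\tilde\Delta)$. In particular,

\[ V^*(\Delta + tI_1)^{-\gamma}B^{1/2} = (\tilde\Delta + tI_2)^{-\gamma}\Phi(B)^{1/2}, \qquad \gamma, t > 0. \tag{5.6} \]

Using (5.6) with $\gamma = 1$ and $\gamma = 2$, we obtain

\[ \begin{aligned} \big\|V^*(\Delta + tI_1)^{-1}B^{1/2}\big\|_{HS}^2 &= \big\langle(\tilde\Delta + tI_2)^{-1}\Phi(B)^{1/2}, (\tilde\Delta + tI_2)^{-1}\Phi(B)^{1/2}\big\rangle_{HS} \\ &= \big\langle(\tilde\Delta + tI_2)^{-2}\Phi(B)^{1/2}, \Phi(B)^{1/2}\big\rangle_{HS} \\ &= \big\langle V^*(\Delta + tI_1)^{-2}B^{1/2}, \Phi(B)^{1/2}\big\rangle_{HS} \\ &= \big\langle(\Delta + tI_1)^{-2}B^{1/2}, B^{1/2}\big\rangle_{HS} = \big\|(\Delta + tI_1)^{-1}B^{1/2}\big\|_{HS}^2. \end{aligned} \]

Therefore, we have $\|V^*x\|_{HS}^2 = \|x\|_{HS}^2$ for $x := (\Delta + tI_1)^{-1}B^{1/2}$, and since $V$ is a contraction we get

\[ 0 \le \|VV^*x - x\|_{HS}^2 = \|VV^*x\|_{HS}^2 - 2\|V^*x\|_{HS}^2 + \|x\|_{HS}^2 = \|VV^*x\|_{HS}^2 - \|x\|_{HS}^2 \le 0, \]

by which $VV^*(\Delta + tI_1)^{-1}B^{1/2} = (\Delta + tI_1)^{-1}B^{1/2}$. Substituting (5.6) with $\gamma = 1$, we finally obtain

\[ V(\tilde\Delta + tI_2)^{-1}\Phi(B)^{1/2} = (\Delta + tI_1)^{-1}B^{1/2}, \qquad t > 0, \tag{5.7} \]

and using again Lemma 5.2 we get

\[ Vh(\tilde\Delta)\Phi(B)^{1/2} = h(\Delta)B^{1/2} \]

for any function $h$ on $\operatorname{spec}(\Delta) \cup \operatorname{spec}(\tilde\Delta)$. By the definition (4.1) of $V$, this means that

\[ \Phi^*\Big(\big(h(\tilde\Delta)\Phi(B)^{1/2}\big)\Phi(B)^{-1/2}\Big)B^{1/2} = h(\Delta)B^{1/2}. \]

In particular, the choice $h(x) := x^z$, $x > 0$, $h(0) := 0$, yields

\[ \Phi^*\big(\Phi(A)^z\Phi(B)^{-z}\big)B^{1/2} = A^zB^{1/2-z}, \qquad z \in \mathbb{C}. \tag{5.8} \]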

Multiplying from the right with $B^{-1/2}$ and taking the adjoint, we obtain (vi).

The implication (vi) $\Longrightarrow$ (vii) is obvious. Assume now that (vii) holds, i.e., $B^{-\alpha}A^\alpha = B^0\Phi^*\big(\Phi(B)^{-\alpha}\Phi(A)^\alpha\big)$ for some $\alpha \in (0,2)\setminus\{1\}$. Multiplying by $B$ and taking the trace, we obtain

\[ S_{f_\alpha}(A\|B) = \operatorname{Tr} A^\alpha B^{1-\alpha} = \operatorname{Tr} B\,\Phi^*\big(\Phi(B)^{-\alpha}\Phi(A)^\alpha\big) = \operatorname{Tr}\Phi(B)^{1-\alpha}\Phi(A)^\alpha = S_{f_\alpha}(\Phi(A)\|\Phi(B)), \]

where $f_\alpha(x) := x^\alpha$, $x \ge 0$. Since the support of the representing measure $\mu_{f_\alpha}$ is $(0,+\infty)$ (see Example 8.3), we see that (vii) implies (iv). The equivalence of (vi) and (viii) is obvious from the fact that the functions $z \mapsto B^0\Phi^*\big(\Phi(B)^{-z}\Phi(A)^z\big)$ and $z \mapsto B^{-z}A^z$ are both analytic on the whole complex plane. Differentiating (viii) at $t = 0$, we obtain (ix). A straightforward computation shows that (ix) yields (iv) for $f(x) := x\log x$, that is, the equality for the standard relative entropy (note that the support of the representing measure for $x\log x$ is $(0,+\infty)$ by Example 8.3). Hence, we have proved that

\[ \text{(iv)} \Longleftrightarrow \text{(v)} \Longleftrightarrow \text{(vi)} \Longleftrightarrow \text{(vii)} \Longleftrightarrow \text{(viii)} \Longleftrightarrow \text{(ix)}. \]

Assume now that (vi) holds. In particular, the choice $z = 0$ yields

\[ B^0\Phi^*(\Phi(A)^0) = A^0 \tag{5.9} \]

(recall that $A^0 \le B^0$). Since $\Phi$ is substochastic, we have $\Phi^*(Y^*Y) \ge \Phi^*(Y^*)\Phi^*(Y) \ge \Phi^*(Y^*)B^0\Phi^*(Y)$, and multiplying from both sides by $B^0$, we obtain that $\Psi(Y) := B^0\Phi^*(Y)B^0$, $Y \in \mathcal{A}_2$, is a Schwarz contraction. For $u_t := \Phi(B)^{-it}\Phi(A)^{it}$ and $w_t := B^{-it}A^{it}$, we have

\[ u_tu_t^* = \Phi(B)^{-it}\Phi(A)^0\Phi(B)^{it}, \qquad w_tw_t^* = B^{-it}A^0B^{it}, \qquad t \in \mathbb{R}. \]

Note that (vi) says that $B^0\Phi^*(u_t) = w_t$, and hence $\Psi(u_t) = w_tB^0 = w_t$. Thus,

\[ \begin{aligned} 0 &\le \operatorname{Tr} B^{1/2}\big(\Psi(u_tu_t^*) - \Psi(u_t)\Psi(u_t^*)\big)B^{1/2} = \operatorname{Tr} B\Phi^*(u_tu_t^*) - \operatorname{Tr} Bw_tw_t^* \\ &= \operatorname{Tr}\Phi(B)\Phi(B)^{-it}\Phi(A)^0\Phi(B)^{it} - \operatorname{Tr} BB^{-it}A^0B^{it} = \operatorname{Tr}\Phi(B)\Phi(A)^0 - \operatorname{Tr} BA^0 \\ &= \operatorname{Tr} B\Phi^*(\Phi(A)^0) - \operatorname{Tr} BA^0 = \operatorname{Tr} BA^0 - \operatorname{Tr} BA^0 = 0, \end{aligned} \]

where we used (5.9). Hence, $B^{1/2}\Psi(u_tu_t^*)B^{1/2} = B^{1/2}\Psi(u_t)\Psi(u_t^*)B^{1/2}$, and multiplying from both sides with $B^{-1/2}$, we obtain $\Psi(u_tu_t^*) = \Psi(u_t)\Psi(u_t^*)$. Since $\Psi(u_t) \ne 0$, and $\Psi$ is a Schwarz contraction, this yields that $\|\Psi\|_S = 1$ and $u_t \in \mathcal{M}_\Psi$. Hence, by Lemma 3.9, $\Psi(u_tY) = \Psi(u_t)\Psi(Y) = w_t\Phi^*(Y)B^0$ for all $Y \in \mathcal{A}_2$ and $t \in \mathbb{R}$, i.e.,

\[ B^0\Phi^*\big(\Phi(B)^{-it}\Phi(A)^{it}Y\big)B^0 = B^{-it}A^{it}\Phi^*(Y)B^0, \qquad t \in \mathbb{R},\ Y \in \mathcal{A}_2. \]

Note that the maps $z \mapsto B^0\Phi^*\big(\Phi(B)^{-z}\Phi(A)^zY\big)B^0$ and $z \mapsto B^{-z}A^z\Phi^*(Y)B^0$ are analytic on the whole complex plane and coincide on $i\mathbb{R}$, and thus they are equal for every $z \in \mathbb{C}$. Choosing $z = 1/2$ and $Y := \Phi(A)^{1/2}\Phi(B)^{-1/2}$, we get

\[ \begin{aligned} B^0\Phi^*\big(\Phi(B)^{-1/2}\Phi(A)^{1/2}\Phi(A)^{1/2}\Phi(B)^{-1/2}\big)B^0 &= B^{-1/2}A^{1/2}\Phi^*\big(\Phi(A)^{1/2}\Phi(B)^{-1/2}\big)B^0 \\ &= B^{-1/2}A^{1/2}A^{1/2}B^{-1/2}, \end{aligned} \]

where we used the adjoint of (vi) with $z = 1/2$. Multiplying from both sides by $B^{1/2}$, we obtain (x).



Finally, assume that (x) holds, and hence

\[ \Phi_B^*(\Phi(A)) = A, \qquad \Phi_B^*(\Phi(B)) = B. \]

Note that $\Phi_B^*$ is not necessarily trace-preserving, as $(\Phi_B^*)^*(I_1) = \Phi_B(I_1) = \Phi(B)^0$, which might be strictly smaller than $I_2$. However, if $\rho$ is a density operator on $\mathcal{H}_1$ then the map $X \mapsto \Phi_B(X) + (\operatorname{Tr}\rho X)(I_2 - \Phi(B)^0)$ is obviously unital and hence its adjoint $\Psi : \mathcal{A}_2 \to \mathcal{A}_1$, $\Psi(Y) = \Phi_B^*(Y) + [\operatorname{Tr}(I_2 - \Phi(B)^0)Y]\rho$ is trace-preserving. Moreover, $\Psi(\Phi(A)) = \Phi_B^*(\Phi(A))$ and $\Psi(\Phi(B)) = \Phi_B^*(\Phi(B))$, as one can easily verify. Since $\Psi$ is obtained from $\Phi^*$ by composing it with completely positive maps and adding a completely positive map, it inherits the positivity of $\Phi^*$, i.e., if $\Phi$, and hence $\Phi^*$, is $n$-positive/completely positive then so is $\Psi$. In particular, if $\Phi$ is 2-positive then $\Psi^*$ is a unital 2-positive map and hence it is also a Schwarz contraction, i.e., $\Psi$ is stochastic. Thus (x) $\Longrightarrow$ (i) holds in this case. □



5.2 Lemma. If $f$ is a complex-valued function on finitely many points $\{x_i\}_{i\in I} \subset [0,+\infty)$ then for any pairwise different positive numbers $\{t_i\}_{i\in I}$, there exist complex numbers $\{c_j\}_{j\in I}$ such that $f(x_i) = \sum_{j\in I}\frac{c_j}{x_i + t_j}$, $i \in I$.

Proof. The matrix $C$ with entries $C_{ij} := \frac{1}{x_i + t_j}$, $i, j \in I$, is a Cauchy matrix which is invertible due to the assumptions that $x_i \ne x_j$ and $t_i \ne t_j$ for $i \ne j$. From this the statement follows. □
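The lemma amounts to solving a linear system with a Cauchy matrix. The following sketch does this exactly over the rationals for small illustrative data; the function names `solve` and `cauchy_coefficients` and the sample points are assumptions made here, and real-valued data is used only for simplicity (the lemma itself allows complex values).

```python
from fractions import Fraction

def solve(matrix, rhs):
    """Solve matrix @ c = rhs exactly by Gauss-Jordan elimination over the rationals."""
    n = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(b)]
           for row, b in zip(matrix, rhs)]
    for col in range(n):
        # pick a row with a non-zero pivot (exists because the matrix is invertible)
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def cauchy_coefficients(xs, ts, values):
    """Find c_j with values[i] == sum_j c_j / (xs[i] + ts[j])."""
    cauchy = [[Fraction(1) / (Fraction(x) + Fraction(t)) for t in ts] for x in xs]
    return solve(cauchy, values)

xs, ts = [0, 1, 2], [1, 2, 3]        # pairwise different points and shifts
values = [x * x for x in xs]         # prescribe f(x) = x^2 on the points
cs = cauchy_coefficients(xs, ts, values)

# the combination of the functions x -> 1 / (x + t_j) reproduces f on the points
for x, fx in zip(xs, values):
    assert sum(c / (Fraction(x) + t) for c, t in zip(cs, ts)) == fx
```

This is the mechanism used in the proof of Theorem 5.1 to pass from the resolvent-type functions $x \mapsto (x+t)^{-1}$, $t \in T$, to arbitrary functions $h$ on a finite spectrum.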

5.3 Corollary. Assume that $\operatorname{supp} A_i \le \operatorname{supp} B_i$, $i = 1,\ldots,r$, in the setting of Corollary 4.7. Then equality holds in (4.22) if and only if

\[ p_iA_i = p_iB_i^{1/2}\Big(\sum\nolimits_j p_jB_j\Big)^{-1/2}\Big(\sum\nolimits_j p_jA_j\Big)\Big(\sum\nolimits_j p_jB_j\Big)^{-1/2}B_i^{1/2}, \qquad i = 1,\ldots,r. \]

Proof. It is immediate from writing out the equality $A = \Phi_B^*(\Phi(A))$ given in (x) in the setting of Corollary 4.7. □

5.4 Remark. Note that if $\operatorname{supp} A \le \operatorname{supp} B$ and $\operatorname{Tr}\Phi(B) = \operatorname{Tr} B$ then for a linear function $f(x) = f(0) + ax$, the preservation of the $f$-divergence is automatic, and has no implication on the reversibility of $\Phi$ on $\{A, B\}$. Indeed, we have $\operatorname{Tr}\Phi(A) = \operatorname{Tr} A$ due to Lemma 3.2 and

\[ S_f(\Phi(A)\|\Phi(B)) = f(0)\operatorname{Tr}\Phi(B) + a\operatorname{Tr}\Phi(A) = f(0)\operatorname{Tr} B + a\operatorname{Tr} A = S_f(A\|B). \]

Note that in the proof of (iv) $\Longrightarrow$ (v) in Theorem 5.1 we used that $f$ has no quadratic term, i.e., $\lim_{x\to+\infty}f(x)/x^2 = 0$. Of course, the same proof would work if we assumed $S_f(\Phi(A)\|\Phi(B)) = S_f(A\|B)$ for some continuous operator convex function $f : [0,+\infty) \to \mathbb{R}$ satisfying (5.4) and, additionally, that $S_{f_2}(\Phi(A)\|\Phi(B)) = S_{f_2}(A\|B)$ for $f_2(x) := x^2$. The following example shows that the exclusion of the quadratic function is not just a technicality of the proof, in the sense that the preservation of the $f_2$-divergence is not sufficient for the conditions (i)-(x) of Theorem 5.1.

x 2 is 



5.5 Example. The /-divergence corresponding to the quadratic function f 2 (x) 
Sf 2 (A\\B) = TyA 2 B~ 1 (when suppA < suppi?). Preservation of the /-divergence by a 
stochastic map is not automatic in this case; however, it is not sufficient for the reversibility 
of the map, either. Indeed, it was shown in Example 2.2 of [27] that there exists a positive 
definite operator D 123 on a tripartite Hilbert space H\®H 2 ® H 3 , such that 



23, 



0Dl2®T 3 )(7i ®D 2 ®T 3 ) 



but 



where r,- 



i 23 {n®D 23 ) lt ^ (D 12 ®r 3 ) 



(n <g> D 2 ® t 3 ) lt for some t e 



(5.10) 



(5.11) 



and D 23 



dim Hi *' 

H:=Hi®H 2 ® Ha, A 



Tr^i D 123: D i2 :— Tr W3 D 123 , D 



w 3 ^i23, ^2 ■= Tr Wl(8W g D 123 . Define 
D 23 . Let A := A 2 := ® W a ) ® 7 3 



3 , /1 .= D123 and 5 := n 
and let $* be the identical embedding of A 2 into Ai. Then, (I5.10p reads as 

Multiplying both sides by A and taking the trace, we obtain 

Tr A 2 B~ X = TvA^(A)^(B)~ 1 . 



(5.12) 



Note that $ is the orthogonal (with respect to the Hilbert-Schmidt inner product) projection 
from Ai onto A 2 , i.e., $ is the conditional expectation onto A 2 with respect to Tr, and 
$(A)$( J B)- 1 G A 2 . Hence, we have Tr A$(A)$( J B)- 1 = Tr $(A) 2 $(5)- 1 . Hence, (KT%\ can 
be rewritten as 

S h (A\\B) = TiA 2 B- 1 = TrQiAfQiB)- 1 = S h ($(A)\\$(B)). 
However, (15. lip tells that 

A it B~ it ^ ($(A) i *$( J B)- i *) for some t E E, 



and hence (viii) in Theorem 15.11 is not satisfied. Since $ is 2-positive (actually, completely 
positive), it means that none of (i)-(x) of Theorem 15.11 are satisfied. 



5.6 Remark. It was shown in [S] that, in the classical setting, preservation of an $f$-divergence
by $\Phi$ is equivalent to the reversibility condition (x) of Theorem 5.1 whenever $f$ is strictly
convex. This shows that the support condition (5.4) might be too restrictive in general. We
reformulate the classical case in our setting in Appendix A, and use the condition for equality
to give a necessary and sufficient condition for the equality in the operator Hölder and inverse
Hölder inequalities.

5.7 Remark. Theorem 5.1 holds also if we replace $\Phi$ and $\Psi$ with co-(sub)stochastic maps,
and change conditions (vi)-(viii) to the following:

(vi)' $B^0\Phi^*(\Phi(A)^z\Phi(B)^{-z}) = B^{-z}A^z$ for all $z\in\mathbb{C}$.

(vii)' $B^0\Phi^*(\Phi(A)^\alpha\Phi(B)^{-\alpha}) = B^{-\alpha}A^\alpha$ for some $\alpha\in(0,2)\setminus\{1\}$.

(viii)' $B^0\Phi^*(\Phi(A)^{it}\Phi(B)^{-it}) = B^{-it}A^{it}$ for all $t\in\mathbb{R}$.



In the proof of (v) => (vi)f , the previous equality Vh{K)^(Bf' 2 = h(A)B 1/2 in (15. 5p is re- 
placed with 

Wi(A)$(B) 1/2 = h(A)B 1/2 



due to the conjugate-linearity of V, where V is given in (TC23|) . In the proof of gyiJJ =>f(x)| 
let it 4 := $(A) a $(-B) -ft and w t := B- U A U ; then 

it*u t = $(£)- ft $(A)°$(.B) ft , tu t u£ = B- it A°B~ i \ t G E. 

Using that $ is a co-Schwarz contraction, we have = From the mult- 

plicative domain for a co-Schwarz contraction, we have <&(Yut) = Q(u t )Q(Y) = w t §*(Y)B° 
for all Y G A 2 and tel. The rest of the proof is as before with Y = §(By 1/2 §(A). The 



implication (x) => (i) holds also if we assume $ to be 2-copositive. 



5.8 Remark. Note that the assumption that $\Phi$ is substochastic guarantees that $\Phi_B$ is
a Schwarz map, which is also subunital. However, as Example 3.6 shows, there exist subunital
Schwarz maps that are not Schwarz contractions, and hence it is not obvious whether $\Phi_B^*$ is
a substochastic map. To avoid this problem, we assumed that $\Phi$ is 2-positive in the proof
of (x) $\Rightarrow$ (i) of Theorem 5.1. It is an open question whether this extra condition can be
dropped and whether $\Phi_B$ can be shown to be a Schwarz contraction by only assuming that $\Phi$
is substochastic.

6 Distinguishability measures related to binary state 
discrimination 

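Let $\mathcal{A} \subset \mathcal{B}(\mathcal{H})$ be a C*-algebra, where $\mathcal{H}$ is a finite-dimensional Hilbert space, and let $\mathcal{S}(\mathcal{A})$
be the state space of $\mathcal{A}$, i.e., $\mathcal{S}(\mathcal{A}) := \{A \in \mathcal{A}_+ : \mathrm{Tr}\,A = 1\}$ is the set of density operators in
$\mathcal{A}$.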

6.1 Definition. For $A, B \in \mathcal{A}_+$, the Chernoff distance $C(A\|B)$ of $A$ and $B$ is defined as

\[ C(A\|B) := \sup_{0<\alpha<1}\{(1-\alpha)S_\alpha(A\|B)\} = -\min_{0\le\alpha\le 1}\psi(\alpha|A\|B), \tag{6.1} \]

where $S_\alpha(A\|B)$ is the Rényi relative entropy defined in Example 2.7 and

\[ \psi(\alpha|A\|B) := \log\mathrm{Tr}\,A^\alpha B^{1-\alpha}, \qquad \alpha\in\mathbb{R}. \tag{6.2} \]

For every $r \in \mathbb{R}$, we define the Hoeffding distance $H_r(A\|B)$ of $A$ and $B$ as

\[ H_r(A\|B) := \sup_{0\le\alpha<1} S_\alpha(e^rA\|B) = \sup_{0\le\alpha<1}\Bigl\{-\frac{\alpha r}{1-\alpha} + S_\alpha(A\|B)\Bigr\} = \sup_{0\le\alpha<1}\frac{-\alpha r - \psi(\alpha|A\|B)}{1-\alpha}. \tag{6.3} \]
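The definitions (6.1)-(6.3) can be evaluated numerically for small matrices. The sketch below is only an illustration: it assumes that powers are taken on the support (so that $A^0$ is the support projection), and it replaces the minimum and the supremum over $\alpha$ by a search over a finite grid, so a Hoeffding distance that is actually $+\infty$ shows up only as a large finite number. The function names are hypothetical.

```python
import numpy as np

def psd_power(M, a, tol=1e-12):
    """Power M^a of a positive semidefinite matrix, taken on the support of M
    (so that M^0 is the support projection)."""
    w, v = np.linalg.eigh(M)
    powered = np.array([x**a if x > tol else 0.0 for x in w])
    return (v * powered) @ v.conj().T

def psi(alpha, A, B):
    """psi(alpha|A||B) = log Tr A^alpha B^(1-alpha), as in (6.2)."""
    return np.log(np.trace(psd_power(A, alpha) @ psd_power(B, 1.0 - alpha)).real)

def chernoff(A, B, grid=2001):
    """Grid approximation of -min_{0<=alpha<=1} psi(alpha|A||B), cf. (6.1)."""
    alphas = np.linspace(0.0, 1.0, grid)
    return -min(psi(a, A, B) for a in alphas)

def hoeffding(r, A, B, grid=2000):
    """Grid approximation of sup_{0<=alpha<1} (-alpha r - psi(alpha)) / (1 - alpha), cf. (6.3)."""
    alphas = np.linspace(0.0, 1.0, grid, endpoint=False)
    return max((-a * r - psi(a, A, B)) / (1.0 - a) for a in alphas)

# A pure state against a diagonal state with weights p and 1 - p.
p = 0.3
rho = np.diag([1.0, 0.0])
sigma_p = np.diag([p, 1.0 - p])
c_value = chernoff(rho, sigma_p)
h_value = hoeffding(0.5, rho, sigma_p)
```

In this example $\psi(\alpha|\rho\|\sigma_p) = (1-\alpha)\log p$, so both quantities equal $-\log p$, and the optimum is attained at the grid point $\alpha = 0$.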
6.2 Remark. Note that

\[ H_r(A\|B) = \sup_{s\ge 0}\{-sr - \tilde\psi(s|A\|B)\}, \tag{6.4} \]

where

\[ \tilde\psi(s|A\|B) := (1+s)\,\psi\bigl(s/(1+s)\,|\,A\|B\bigr), \quad s\in[0,+\infty), \qquad \tilde\psi(s|A\|B) := +\infty, \quad s<0. \]

For simplicity, we will use the notation $\psi(\alpha) := \psi(\alpha|A\|B)$ and $\tilde\psi(s) := \tilde\psi(s|A\|B)$. Let
$\tilde\psi^*(r) := \sup_{s\in\mathbb{R}}\{sr - \tilde\psi(s)\}$ be the polar function, or Legendre-Fenchel transform of $\tilde\psi$ [12].
By (6.4), $H_r(A\|B) = \tilde\psi^*(-r)$, $r\in\mathbb{R}$. It is easy to see (by computing its second derivative) that
$\psi$ is convex, and hence so is $\tilde\psi$. Furthermore, $\tilde\psi'(s) = \psi(s/(1+s)) + \psi'(s/(1+s))/(1+s)$, $s \in
(0, +\infty)$, and $\partial^+\tilde\psi(0) = \psi(0) + \psi'(0)$, where $\partial^+\tilde\psi(0)$ is the right derivative of $\tilde\psi$ at 0. In
particular, $\lim_{s\to+\infty}\tilde\psi'(s) = \psi(1)$. Hence,

\[ H_r(A\|B) = \tilde\psi^*(-r) = \begin{cases} -\tilde\psi(0) = -\psi(0), & -r \le \psi(0)+\psi'(0),\\ +\infty, & -r > \psi(1). \end{cases} \]

It is easy to see that

\[ \psi(0) = -S_0(A\|B), \quad\text{and if } A^0 \ge B^0 \text{ then } \psi'(0) = -S(B\|A), \]
\[ \psi(1) = -S_0(B\|A), \quad\text{and if } A^0 \le B^0 \text{ then } \psi'(1) = S(A\|B). \]

Being a polar function, $\tilde\psi^*$ is convex, and hence so is the function $r \mapsto H_r(A\|B)$. Moreover,
$\tilde\psi$ is lower semicontinuous and thus the bipolar theorem (see, e.g., Proposition 4.1 in [12])
yields that $\tilde\psi$ is the polar function of its polar $\tilde\psi^*$. Hence, for every $s \in [0, +\infty)$, we have

\[ (1+s)\,\psi\Bigl(\frac{s}{1+s}\Bigr) = \tilde\psi(s) = \sup_{r\in\mathbb{R}}\{sr - \tilde\psi^*(r)\} = \sup_{\psi(0)+\psi'(0)\le -r\le\psi(1)}\{-rs - \tilde\psi^*(-r)\}. \]

Replacing $s$ with $\alpha/(1-\alpha)$, we finally get that for every $\alpha \in [0, 1)$,

\[ -S_\alpha(A\|B) = \frac{\psi(\alpha)}{1-\alpha} = \sup_{r\in\mathbb{R}}\Bigl\{-\frac{\alpha r}{1-\alpha} - H_r(A\|B)\Bigr\} = \sup_{-\psi(1)\le r\le -\psi(0)-\psi'(0)}\Bigl\{-\frac{\alpha r}{1-\alpha} - H_r(A\|B)\Bigr\}. \tag{6.5} \]

That is, the Rényi $\alpha$-relative entropies with parameter $\alpha \in [0, 1)$ and the Hoeffding distances
mutually determine each other.

If $\mathrm{Tr}\,A \le 1$ then $\psi(1) = \log\mathrm{Tr}\,AB^0 \le 0$, and hence the optimization is over non-negative
values of $r$ in the last formula of (6.5). Thus, $\alpha \mapsto S_\alpha(A\|B)$ is monotonic increasing on $[0, 1)$
and hence

\[ H_0(A\|B) = \lim_{\alpha\nearrow 1} S_\alpha(A\|B) =: S_1(A\|B). \]

Note that $\tilde\psi^*$ is lower semicontinuous (see, e.g., Proposition 4.1 and Corollary 4.1 in [12]), and
hence $\tilde\psi^*(0) \le \liminf_{r\searrow 0}\tilde\psi^*(-r)$. On the other hand, it is obvious from the definition that
$r \mapsto H_r(A\|B) = \tilde\psi^*(-r)$ is monotonic decreasing on $\mathbb{R}$, and hence we finally obtain

\[ \lim_{r\searrow 0} H_r(A\|B) = \lim_{r\searrow 0}\tilde\psi^*(-r) = \tilde\psi^*(0) = H_0(A\|B) = S_1(A\|B). \tag{6.6} \]

Finally, it is easy to verify that

\[ S_1(A\|B) = S(A\|B) \quad\text{if } \mathrm{Tr}\,A = 1. \tag{6.7} \]
The importance of the above measures comes from the problem of binary state discrimination,
that we briefly describe below. Assume that we have several identical copies of a
quantum system, and we know that either all of them are in a state described by a density
operator $\rho$, or all of them are in a state described by a density operator $\sigma$. We assume that
the system's Hilbert space $\mathcal{H}$ is finite-dimensional. Our goal is to give a good guess on the
true state of the system, based on the outcome of a binary POVM measurement $(T, I - T)$
on a fixed number (say $n$) of copies, where $T$ is an operator on $\mathcal{H}^{\otimes n}$ satisfying $0 \le T \le I$. If the
outcome corresponding to $T$ happens then we conclude that the state of the system is $\rho$, and
an error occurs if the true state is $\sigma$, which has probability $\beta_n(T) := \mathrm{Tr}\,\sigma^{\otimes n}T$. Similarly, the
outcome corresponding to $I - T$ yields the guess $\sigma$ for the true state, and the probability of
error in this case is $\alpha_n(T) := \mathrm{Tr}\,\rho^{\otimes n}(I - T)$. If, moreover, there are prior probabilities $p$ and
$1 - p$ assigned to $\rho$ and $\sigma$, then the optimal Bayesian error probability is given by

\[ P_{n,p} := \min_T\{p\,\alpha_n(T) + (1-p)\,\beta_n(T)\} = \bigl(1 - \|p\rho^{\otimes n} - (1-p)\sigma^{\otimes n}\|_1\bigr)/2, \]

where the minimum is reached at $T = \{p\rho^{\otimes n} - (1-p)\sigma^{\otimes n} > 0\}$, the spectral projection
corresponding to the positive part of the spectrum of $p\rho^{\otimes n} - (1-p)\sigma^{\otimes n}$. For every $p \in (0, 1)$, let

\[ T_p(\rho\|\sigma) := \begin{cases} -\log\frac{1}{2p}\bigl(1 - \|p\rho - (1-p)\sigma\|_1\bigr) = -\log\frac{1}{p}P_{1,p}, & 0 < p \le 1/2,\\[2pt] -\log\frac{1}{2(1-p)}\bigl(1 - \|p\rho - (1-p)\sigma\|_1\bigr) = -\log\frac{1}{1-p}P_{1,p}, & 1/2 \le p < 1. \end{cases} \tag{6.8} \]

The theorem for the quantum Chernoff bound [31 [36] says that, as the number of copies $n$
tends to infinity, the error probabilities $P_{n,p}$ decay exponentially, and the rate of the decay is
given by the Chernoff distance. More formally,

\[ -\lim_{n\to\infty}(1/n)\log P_{n,p} = \lim_{n\to\infty}(1/n)\,T_p(\rho^{\otimes n}\|\sigma^{\otimes n}) = C(\rho\|\sigma), \qquad p\in(0,1). \tag{6.9} \]
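The closed formula for the optimal Bayesian error probability can be checked numerically in the single-copy case $n = 1$. The following sketch uses a hypothetical pair of qubit states and hypothetical helper names; it compares the closed form $(1 - \|p\rho - (1-p)\sigma\|_1)/2$ with the error of the test given by the spectral projection onto the positive part of $p\rho - (1-p)\sigma$.

```python
import numpy as np

def trace_norm(X):
    """Trace norm of a self-adjoint matrix: the sum of the absolute eigenvalues."""
    return np.abs(np.linalg.eigvalsh(X)).sum()

def bayes_error(p, rho, sigma):
    """Closed form (1 - ||p rho - (1-p) sigma||_1) / 2."""
    return 0.5 * (1.0 - trace_norm(p * rho - (1.0 - p) * sigma))

def error_of_test(p, rho, sigma, T):
    """p * alpha(T) + (1-p) * beta(T) for a test 0 <= T <= I."""
    dim = rho.shape[0]
    alpha = np.trace(rho @ (np.eye(dim) - T)).real   # guess sigma although rho is true
    beta = np.trace(sigma @ T).real                  # guess rho although sigma is true
    return p * alpha + (1.0 - p) * beta

def optimal_test(p, rho, sigma):
    """Spectral projection onto the positive part of p rho - (1-p) sigma."""
    w, v = np.linalg.eigh(p * rho - (1.0 - p) * sigma)
    pos = v[:, w > 0]
    return pos @ pos.conj().T

# Hypothetical example: two non-commuting pure qubit states.
p = 0.4
rho = np.array([[1.0, 0.0], [0.0, 0.0]])
sigma = np.array([[0.5, 0.5], [0.5, 0.5]])

T_opt = optimal_test(p, rho, sigma)
closed_form = bayes_error(p, rho, sigma)
achieved = error_of_test(p, rho, sigma, T_opt)
trivial = error_of_test(p, rho, sigma, 0.5 * np.eye(2))  # a non-optimal test
```

The optimal projection attains the closed-form value, while any other test, such as the coin-flip test $T = I/2$, can only do worse.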

In the asymmetric setting of the quantum Hoeffding bound, the error probabilities $\alpha_n$ are
required to be exponentially small, and $\beta_n$ is optimized under this constraint, i.e., one is
interested in the quantities

\[ \beta_{n,r} := \min\{\beta_n(T) : \alpha_n(T) \le e^{-nr},\ T \in \mathcal{B}(\mathcal{H}^{\otimes n}),\ 0 \le T \le I\}, \]

where $r$ is some fixed positive number. The theorem for the quantum Hoeffding bound [151 ES]
says that, for every $r > 0$, the error probabilities $\beta_{n,r}$ decay exponentially fast as $n$ goes to
infinity, and the decay rate is given by the Hoeffding distance with parameter $r$. Moreover, if
$\mathrm{supp}\,\rho \le \mathrm{supp}\,\sigma$, then for every $r > 0$ we have a real number $a_r$ such that



- lim (l/n)log/3 n , r = lim (l/n) T „ (p® n ||a® n ) = H r {p\\a). (6.10) 

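Note that for density operators $\rho$ and $\sigma$, $\psi(\alpha|\rho\|\sigma) = \log\mathrm{Tr}\,\rho^\alpha\sigma^{1-\alpha} \le 0$ for every $\alpha \in [0, 1]$
due to Hölder's inequality (A.8). Hence, $C(\rho\|\sigma) \ge 0$, and $C(\rho\|\sigma) = 0$ if and only if equality
holds in Hölder's inequality, which is equivalent to $\rho = \sigma$. Similarly, $H_r(\rho\|\sigma) \ge 0$ for every
$r \in \mathbb{R}$, and $H_r(\rho\|\sigma) = 0$ if and only if $\rho = \sigma$, or $\mathrm{supp}\,\rho \ge \mathrm{supp}\,\sigma$ and $r \ge S(\sigma\|\rho)$.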

6.3 Proposition. Let $A, B \in \mathcal{A}_{1,+}$ and let $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ be a substochastic map such that
$\mathrm{Tr}\,\Phi(B) = \mathrm{Tr}\,B$. Then

\[ C(\Phi(A)\|\Phi(B)) \le C(A\|B) \quad\text{and}\quad H_r(\Phi(A)\|\Phi(B)) \le H_r(A\|B), \quad r\in\mathbb{R}. \tag{6.11} \]

If there exists a substochastic map $\Psi : \mathcal{A}_2 \to \mathcal{A}_1$ such that $\Psi(\Phi(A)) = A$ and $\Psi(\Phi(B)) = B$
then the inequalities in (6.11) hold with equality.

Proof. By Example WM $S_\alpha(\Phi(A)\|\Phi(B)) \le S_\alpha(A\|B)$ for every $\alpha \in [0,1)$, and equality holds
for every $\alpha \in [0, 1)$ if there exists a substochastic map $\Psi : \mathcal{A}_2 \to \mathcal{A}_1$ such that $\Psi(\Phi(A)) = A$
and $\Psi(\Phi(B)) = B$, due to Theorem 5.1. The assertion then follows immediately from the
definitions (6.1) and (6.3). □

Our goal now is to give the converse of the above proposition, i.e., to show that equality
in the inequalities of (6.11) yields the existence of a substochastic map $\Psi : \mathcal{A}_2 \to \mathcal{A}_1$ such
that $\Psi(\Phi(A)) = A$ and $\Psi(\Phi(B)) = B$. This would be immediate from Theorem 5.1 if the
Chernoff and the Hoeffding distances could be represented as $f$-divergences (at least when $\Phi$
is also assumed to be 2-positive). However, no such representation is possible, as is shown in
the following proposition:

6.4 Proposition. The Chernoff and the Hoeffding distances cannot be represented as
$f$-divergences on the state space of any non-trivial finite-dimensional C*-algebra.

Proof. Let $\mathcal{A} \subset \mathcal{B}(\mathcal{H})$ where $\dim\mathcal{H} \ge 2$, and let $e_1, e_2$ be orthonormal vectors in $\mathcal{H}$ such
that $|e_j\rangle\langle e_j| \in \mathcal{A}$, $j = 1, 2$. Define $\rho := |e_1\rangle\langle e_1|$, $\sigma_p := p|e_1\rangle\langle e_1| + (1-p)|e_2\rangle\langle e_2|$, $p \in (0, 1)$.
One can easily check that $C(\rho\|\sigma_p) = H_r(\rho\|\sigma_p) = -\log p$ for every $r > 0$, while $S_f(\rho\|\sigma_p) =
pf(1/p) + (1-p)f(0)$ for any function $f$ on $[0, +\infty)$. Hence, if any of the above measures
can be represented as an $f$-divergence, then we have $pf(1/p) + (1-p)f(0) = -\log p$ for
the representing function $f$, and taking the limit $p \searrow 0$ yields $\omega(f) = +\infty$. In particular,
$S_f(\sigma_p\|\rho) = +\infty$ for every $p \in (0, 1)$. On the other hand, $C(\sigma_p\|\rho) = -\log p$ and $H_r(\sigma_p\|\rho) = 0$
if $r \ge -\log p$. That is, $C(\sigma_p\|\rho)$ is finite for every $p \in (0, 1)$ and for every $r > 0$ there exists a
$p \in (0, 1)$ such that $H_r(\sigma_p\|\rho)$ is finite. □



Note, however, that for the applications of Theorems 4.3 and 5.1 it is sufficient to have
a more general representability. Indeed, let $\mathcal{A}$ be a finite-dimensional C*-algebra and $D :
\mathcal{S}(\mathcal{A}) \times \mathcal{S}(\mathcal{A}) \to \mathbb{R}$. We say that $D$ is a monotone function of an $f$-divergence on the state
space of $\mathcal{A}$ if there exists an operator convex function $f : [0, +\infty) \to \mathbb{R}$ and a strictly
monotonic increasing function $g : \{S_f(\rho\|\sigma) : \rho, \sigma \in \mathcal{S}(\mathcal{A})\} \to \mathbb{R} \cup \{\pm\infty\}$ such that

\[ D(\rho\|\sigma) = g\bigl(S_f(\rho\|\sigma)\bigr), \qquad \rho, \sigma \in \mathcal{S}(\mathcal{A}). \]

Obviously, if $D$ is a monotone function of an $f$-divergence then it is monotonic non-increasing
under stochastic maps due to Theorem 4.3. Moreover, if $D(\Phi(\rho)\|\Phi(\sigma)) = D(\rho\|\sigma)$ for some
stochastic map $\Phi$ and $\rho, \sigma \in \mathcal{S}(\mathcal{A})$ such that $\mathrm{supp}\,\rho \le \mathrm{supp}\,\sigma$, and the representing function
$f$ satisfies $f \in \mathcal{F}$ and $|\mathrm{supp}\,\mu_f| \ge |\mathrm{spec}(L_\rho R_{\sigma^{-1}}) \cup \mathrm{spec}(L_{\Phi(\rho)}R_{\Phi(\sigma)^{-1}})|$ then $\Phi_\sigma^*(\Phi(\rho)) = \rho$,
due to (iv) of Theorem 5.1. For instance, the Rényi $\alpha$-relative entropy is a monotone function
of the $f_\alpha$-divergence with $g(x) := \frac{1}{\alpha-1}\log\bigl(\mathrm{sgn}(\alpha-1)x\bigr)$, for every $\alpha \in [0,2] \setminus \{1\}$. However,
the same argument as in Proposition 6.4 yields that none of the Rényi relative entropies with
parameter $\alpha \in (0, 1)$ can be represented as $f$-divergences.

6.5 Proposition. For any $r \in (0, +\infty)$ and any non-trivial C*-algebra $\mathcal{A}$, the Hoeffding
distance $H_r$ cannot be represented on the state space of $\mathcal{A}$ as a monotone function of an
$f$-divergence with a continuous operator convex function $f \in \mathcal{F}$ such that $|\mathrm{supp}\,\mu_f| \ge 6$.

Proof. Let $\mathcal{A} \subset \mathcal{B}(\mathcal{H})$ be a C*-algebra and let $e_1, e_2$ be orthonormal vectors in $\mathcal{H}$ such that
$|e_1\rangle\langle e_1|, |e_2\rangle\langle e_2| \in \mathcal{A}$. Choose $p, q \in (0, 1)$ such that $p \ne q$ and $q\log\frac{q}{p} + (1-q)\log\frac{1-q}{1-p} < r$, and
define $\rho := p|e_1\rangle\langle e_1| + (1-p)|e_2\rangle\langle e_2|$ and $\sigma := q|e_1\rangle\langle e_1| + (1-q)|e_2\rangle\langle e_2|$. Then $\psi(0|\rho\|\sigma) = 0$
and $-\psi(0|\rho\|\sigma) - \psi'(0|\rho\|\sigma) = S(\sigma\|\rho) = q\log\frac{q}{p} + (1-q)\log\frac{1-q}{1-p} < r$, and hence $H_r(\rho\|\sigma) =
-\psi(0|\rho\|\sigma) = 0$. Define $\Phi : \mathcal{A} \to \mathcal{A}$, $\Phi(X) := (\mathrm{Tr}\,X)\,I/(\dim\mathcal{H})$. Then $\Phi$ is completely
positive and trace-preserving, $\Phi(\rho) = \Phi(\sigma)$, and hence $H_r(\Phi(\rho)\|\Phi(\sigma)) = 0 = H_r(\rho\|\sigma)$. Note
that $|\mathrm{spec}(L_\rho R_{\sigma^{-1}})| \le 5$ and $|\mathrm{spec}(L_{\Phi(\rho)}R_{\Phi(\sigma)^{-1}})| = 1$. If we had $H_r(\rho\|\sigma) = g(S_f(\rho\|\sigma))$ and
$H_r(\Phi(\rho)\|\Phi(\sigma)) = g(S_f(\Phi(\rho)\|\Phi(\sigma)))$ for some strictly monotone $g$ and continuous operator
convex $f \in \mathcal{F}$ such that $|\mathrm{supp}\,\mu_f| \ge 6$ then Theorem 5.1 would yield $\Phi_\sigma^*(\Phi(\rho)) = \rho$. However,
$\Phi(\rho) = \Phi(\sigma)$ and hence $\Phi_\sigma^*(\Phi(\rho)) = \Phi_\sigma^*(\Phi(\sigma)) = \sigma \ne \rho$. □

The above proposition also shows that the preservation of a Hoeffding distance of a pair
$(\rho, \sigma)$ by a stochastic map for a given parameter $r$ might not be sufficient for the reversibility
of $\Phi$ on $\{\rho, \sigma\}$ in the sense of Theorem 5.1; the reason for this in the above proof is that the
Hoeffding distance might be equal to zero even for non-equal states. The Chernoff distance,
on the other hand, is always strictly positive for unequal states; yet the following example
shows that the preservation of the Chernoff distance is not sufficient for reversibility in general,
either.

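6.6 Example. Let $\mathcal{H} := \mathbb{C}^3$ and let $\mathcal{A}$ be the commutative C*-algebra of operators on
$\mathcal{H}$ that are diagonal in some fixed basis $e_1, e_2, e_3$. Let $\rho := (2/3)|e_1\rangle\langle e_1| + (1/3)|e_2\rangle\langle e_2|$,
$\sigma := (1/6)|e_1\rangle\langle e_1| + (1/3)|e_2\rangle\langle e_2| + (1/2)|e_3\rangle\langle e_3|$, and define $\Phi : \mathcal{A} \to \mathcal{A}$ as

\[ \Phi(|e_1\rangle\langle e_1|) := \Phi(|e_2\rangle\langle e_2|) := |e_1\rangle\langle e_1|, \qquad \Phi(|e_3\rangle\langle e_3|) := |e_3\rangle\langle e_3|. \]

Then $\Phi$ is completely positive and trace-preserving, and we have $\Phi(\rho) = |e_1\rangle\langle e_1|$, $\Phi(\sigma) =
(1/2)|e_1\rangle\langle e_1| + (1/2)|e_3\rangle\langle e_3|$. For every $\alpha \in \mathbb{R}$, we have $\mathrm{Tr}\,\rho^\alpha\sigma^{1-\alpha} = \frac{4^\alpha+2}{6}$ and $\mathrm{Tr}\,\Phi(\rho)^\alpha\Phi(\sigma)^{1-\alpha} =
2^{\alpha-1}$, and hence

\[ C(\Phi(\rho)\|\Phi(\sigma)) = -\psi(0|\Phi(\rho)\|\Phi(\sigma)) = S_0(\Phi(\rho)\|\Phi(\sigma)) = \log 2 = S_0(\rho\|\sigma) = -\psi(0|\rho\|\sigma) = C(\rho\|\sigma). \]

On the other hand, it is easy to see that $\Phi_\sigma^*(\Phi(\rho)) = (1/3)|e_1\rangle\langle e_1| + (2/3)|e_2\rangle\langle e_2| \ne \rho$, and
therefore (x) of Theorem 5.1 does not hold, and hence $\Phi$ is not reversible on the pair $\{\rho, \sigma\}$.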
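Since everything in Example 6.6 is diagonal, the computation can be replayed with plain probability vectors. The sketch below is an illustration only: the helper names are hypothetical, the Chernoff distance is approximated by a grid search over $\alpha$, and the recovery map is the commutative form of $Y \mapsto \sigma^{1/2}\Phi^*(\Phi(\sigma)^{-1/2}Y\Phi(\sigma)^{-1/2})\sigma^{1/2}$.

```python
import numpy as np

rho = np.array([2/3, 1/3, 0.0])
sigma = np.array([1/6, 1/3, 1/2])

def phi(x):
    """The map of Example 6.6 on diagonal entries: e1, e2 -> e1 and e3 -> e3."""
    return np.array([x[0] + x[1], 0.0, x[2]])

def phi_dual(y):
    """Dual of phi with respect to the trace: entry i reads y at the image of e_i."""
    return np.array([y[0], y[0], y[2]])

def power_on_support(x, a):
    """Entrywise power on the support, so that x^0 is the support indicator."""
    out = np.zeros_like(x)
    mask = x > 0
    out[mask] = x[mask] ** a
    return out

def psi(alpha, a, b):
    """psi(alpha|a||b) = log sum_i a_i^alpha b_i^(1-alpha)."""
    return np.log(np.sum(power_on_support(a, alpha) * power_on_support(b, 1.0 - alpha)))

def chernoff(a, b, grid=1001):
    """Grid approximation of -min over alpha in [0, 1] of psi."""
    return -min(psi(al, a, b) for al in np.linspace(0.0, 1.0, grid))

def recovery(y, b):
    """Commutative form of b^{1/2} phi_dual(phi(b)^{-1/2} y phi(b)^{-1/2}) b^{1/2}."""
    pb = phi(b)
    ratio = np.zeros_like(y)
    mask = pb > 0
    ratio[mask] = y[mask] / pb[mask]
    return b * phi_dual(ratio)

c_before = chernoff(rho, sigma)
c_after = chernoff(phi(rho), phi(sigma))
recovered = recovery(phi(rho), sigma)
```

Both Chernoff distances come out as $\log 2$, with the minimum attained at the endpoint $\alpha = 0$, while the recovered vector is $(1/3, 2/3, 0)$ rather than $\rho = (2/3, 1/3, 0)$.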



6.7 Remark. Note that in the setting of Theorem 5.1, if $\Phi$ is 2-positive and $S_\alpha(\Phi(A)\|\Phi(B)) =
S_\alpha(A\|B)$ for some $\alpha \in (0, 1)$ then $\Phi_B^*(\Phi(A)) = A$, i.e., the preservation of a Rényi $\alpha$-relative
entropy with some $\alpha \in (0,1)$ is sufficient for the reversibility of $\Phi$ on $\{A, B\}$. The above
example shows that the same is not true for the 0-relative entropy.

6.8 Corollary. Let $\mathcal{A}$ be a C*-algebra of dimension at least 3. Then the Chernoff distance
cannot be represented on its state space as a monotone function of an $f$-divergence with an
$f \in \mathcal{F}$ such that $|\mathrm{supp}\,\mu_f| \ge 6$.

Proof. Immediate from Example 6.6. □

After the above preparation, we are ready to prove the analogue of Theorem 5.1 for the
preservation of the Chernoff and the Hoeffding distances. The preservation of the Chernoff
distance was already treated in the proof of Theorem 6 in [23] in the case where both operators
are invertible density operators and the substochastic map is the trace-preserving conditional
expectation onto a subalgebra. We use essentially the same proof to treat the general case
below.

6.9 Theorem. Let $A, B \in \mathcal{A}_{1,+}$ be such that $\mathrm{supp}\,A \le \mathrm{supp}\,B$, let $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ be a
substochastic map such that $\mathrm{Tr}\,\Phi(B) = \mathrm{Tr}\,B$, and assume that (i) or (ii) below holds:

(i) $C(\Phi(A)\|\Phi(B)) \ne S_0(\Phi(A)\|\Phi(B))$, $C(\Phi(A)\|\Phi(B)) \ne S_0(\Phi(B)\|\Phi(A))$, and

\[ C(\Phi(A)\|\Phi(B)) = C(A\|B). \]

(ii) For some $r \in \bigl(-\psi(1|\Phi(A)\|\Phi(B)),\ -\psi(0|\Phi(A)\|\Phi(B)) - \psi'(0|\Phi(A)\|\Phi(B))\bigr)$,

\[ H_r(\Phi(A)\|\Phi(B)) = H_r(A\|B). \tag{6.12} \]

Then $\Phi_B^*(\Phi(A)) = A$, and if $\Phi$ is 2-positive then there exists a stochastic map $\Psi : \mathcal{A}_2 \to \mathcal{A}_1$
such that $\Psi(\Phi(A)) = A$ and $\Psi(\Phi(B)) = B$.

Proof. Assume first that (i) holds. Due to the assumptions $C(\Phi(A)\|\Phi(B)) \ne S_0(\Phi(A)\|\Phi(B)) =
-\psi(0|\Phi(A)\|\Phi(B))$, $C(\Phi(A)\|\Phi(B)) \ne S_0(\Phi(B)\|\Phi(A)) = -\psi(1|\Phi(A)\|\Phi(B))$, and the
definition (6.1) of the Chernoff distance, there exists an $\alpha^* \in (0,1)$ such that $C(\Phi(A)\|\Phi(B)) =
-\psi(\alpha^*|\Phi(A)\|\Phi(B))$. Using the monotonicity relation (4.17), we get

\[ C(\Phi(A)\|\Phi(B)) = -\log\mathrm{Tr}\,\Phi(A)^{\alpha^*}\Phi(B)^{1-\alpha^*} \le -\log\mathrm{Tr}\,A^{\alpha^*}B^{1-\alpha^*} \le C(A\|B) = C(\Phi(A)\|\Phi(B)). \]

Hence, $\mathrm{Tr}\,\Phi(A)^{\alpha^*}\Phi(B)^{1-\alpha^*} = \mathrm{Tr}\,A^{\alpha^*}B^{1-\alpha^*}$, which yields $\Phi_B^*(\Phi(A)) = A$ due to (iv) of
Theorem 5.1.

Assume next that (6.12) holds for some $r \in \bigl(-\psi(1|\Phi(A)\|\Phi(B)),\ -\psi(0|\Phi(A)\|\Phi(B)) -
\psi'(0|\Phi(A)\|\Phi(B))\bigr)$. Then there exists an $s^* \in (0, +\infty)$ such that $H_r(\Phi(A)\|\Phi(B)) = -s^*r -
\tilde\psi(s^*|\Phi(A)\|\Phi(B))$ (see Remark 6.2). Thus, $H_r(\Phi(A)\|\Phi(B)) = -\alpha^*r/(1-\alpha^*) + S_{\alpha^*}(\Phi(A)\|\Phi(B))$,
where $\alpha^* := \frac{s^*}{1+s^*} \in (0, 1)$. Using the monotonicity (4.21), we obtain

\[ H_r(\Phi(A)\|\Phi(B)) = -\alpha^*r/(1-\alpha^*) + S_{\alpha^*}(\Phi(A)\|\Phi(B)) \le -\alpha^*r/(1-\alpha^*) + S_{\alpha^*}(A\|B) \le H_r(A\|B) = H_r(\Phi(A)\|\Phi(B)). \]

Hence, $\mathrm{Tr}\,\Phi(A)^{\alpha^*}\Phi(B)^{1-\alpha^*} = \mathrm{Tr}\,A^{\alpha^*}B^{1-\alpha^*}$, which yields $\Phi_B^*(\Phi(A)) = A$ due to (iv) of
Theorem 5.1.

Finally, if $\Phi$ is 2-positive then $\Phi_B^*(\Phi(A)) = A$ yields the existence of $\Psi$ in the last assertion
the same way as in the proof of (x) $\Rightarrow$ (i) in Theorem 5.1. □

6.10 Corollary. Assume in the setting of Theorem 6.9 that $\mathrm{supp}\,A = \mathrm{supp}\,B$ and $\mathrm{Tr}\,A =
\mathrm{Tr}\,B$. If $C(\Phi(A)\|\Phi(B)) = C(A\|B)$ then $\Phi_B^*(\Phi(A)) = A$.

Proof. Let $\psi(\alpha) := \psi(\alpha|\Phi(A)\|\Phi(B))$, $\alpha \in \mathbb{R}$. By the assumptions, we have $\mathrm{supp}\,\Phi(A) =
\mathrm{supp}\,\Phi(B)$ and $\mathrm{Tr}\,\Phi(A) = \mathrm{Tr}\,\Phi(B)$, and hence $\psi(0) = \psi(1)$. Since $\psi$ is convex, there are
two possibilities: either $\psi$ is constant, or the minimum of $\psi$ on $[0, 1]$ is attained at some
$\alpha^* \in (0,1)$. In the latter case we have $C(\Phi(A)\|\Phi(B)) \ne S_0(\Phi(A)\|\Phi(B))$, $C(\Phi(A)\|\Phi(B)) \ne
S_0(\Phi(B)\|\Phi(A))$, and hence the assertion follows due to Theorem 6.9. If $\psi$ is constant then
we have $\mathrm{Tr}\,\Phi(A)^\alpha\Phi(B)^{1-\alpha} = \mathrm{Tr}\,\Phi(A)^0\Phi(B) = \mathrm{Tr}\,\Phi(B) = \mathrm{Tr}\,\Phi(A) = (\mathrm{Tr}\,\Phi(A))^\alpha(\mathrm{Tr}\,\Phi(B))^{1-\alpha}$ for every
$\alpha \in [0, 1]$, and the equality case in Hölder's inequality yields that $\Phi(A)$ is a constant multiple of
$\Phi(B)$ (see the corresponding corollary in Appendix A). Since $\mathrm{Tr}\,\Phi(A) = \mathrm{Tr}\,\Phi(B)$, this yields that $\Phi(A) = \Phi(B)$. Similarly,

\[ -\min_{0\le\alpha\le 1}\psi(\alpha|A\|B) = C(A\|B) = C(\Phi(A)\|\Phi(B)) = -\log\mathrm{Tr}\,\Phi(A) = -\log\mathrm{Tr}\,A = -\psi(0|A\|B), \]

and since $\mathrm{Tr}\,A = \mathrm{Tr}\,B$, we also have $-\log\mathrm{Tr}\,A = -\log\mathrm{Tr}\,B = -\psi(1|A\|B)$. Hence, $\alpha \mapsto
\psi(\alpha|A\|B)$ is constant on $[0, 1]$, and the same argument as above yields that $A = B$. Therefore,
$\Phi_B^*(\Phi(A)) = \Phi_B^*(\Phi(B)) = B = A$. □

6.11 Remark. Note that the interval $\bigl(-\psi(1|\Phi(A)\|\Phi(B)),\ -\psi(0|\Phi(A)\|\Phi(B)) - \psi'(0|\Phi(A)\|\Phi(B))\bigr)$
of Theorem 6.9 might be empty; this happens if and only if $\alpha \mapsto \psi(\alpha|\Phi(A)\|\Phi(B))$ is
constant. A characterization of this situation was given in Lemma 3.2 of [22].

7 Error correction

Noise in quantum mechanics is usually modeled by completely positive trace non-increasing
maps. The aim of error correction is, given a noise operation $\Phi$, to identify a subset $\mathcal{C}$ of the
state space (called the code) and a quantum operation $\Psi$ such that it reverses the action of the
noise on the code, i.e., $\Psi(\Phi(\rho)) = \rho$, $\rho \in \mathcal{C}$. It was first noticed in [12] that the preservation of
certain distinguishability measures of two states by the noise operation is a sufficient condition
for correctability of the noise on those two states. This result was later extended to general
families of states in [2U [22]. The measures considered in these papers were the Rényi relative
entropies and the standard relative entropy. Recently, the same problem was considered in
|] using the measures $T_p$ given in (6.8), and similar results were found, although only under
some extra technical conditions. Below we summarize these results and extend them to a
wide class of measures, based on Theorem 5.1.

Let $\mathcal{A}_i$ be a C*-algebra on $\mathcal{H}_i$ for $i = 1, 2$, and let $\mathcal{S}(\mathcal{A}_i)$ denote the set of density operators
in $\mathcal{A}_i$. For a non-empty set $\mathcal{C} \subset \mathcal{S}(\mathcal{A}_1)$, let $\mathrm{co}\,\mathcal{C}$ denote the closed convex hull of $\mathcal{C}$, and let
$\mathrm{supp}\,\mathcal{C}$ be the supremum of the supports of all states in $\mathcal{C}$. Note that there exists a state
$\sigma \in \mathrm{co}\,\mathcal{C}$ such that $\mathrm{supp}\,\sigma = \mathrm{supp}\,\mathcal{C}$. We introduce the notation $d_2 := (\dim\mathcal{H}_1)^2 + (\dim\mathcal{H}_2)^2$.
Note that if $X \in \mathcal{A}_1$ and $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ is a trace non-increasing positive map then

\[ \begin{aligned} \|\Phi(X)\|_1 &= \max\{\mathrm{Tr}\,\Phi(X)S : S \in \mathcal{A}_2 \text{ self-adjoint},\ -I_2 \le S \le I_2\}\\ &= \max\{\mathrm{Tr}\,X\Phi^*(S) : S \in \mathcal{A}_2 \text{ self-adjoint},\ -I_2 \le S \le I_2\}\\ &\le \max\{\mathrm{Tr}\,XR : R \in \mathcal{A}_1 \text{ self-adjoint},\ -I_1 \le R \le I_1\} = \|X\|_1, \end{aligned} \]

which in particular yields that the measures $T_p$ are monotonic non-increasing under
substochastic maps.

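7.1 Theorem. Let $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ be a trace-preserving 2-positive map, and let $\mathcal{C} \subset \mathcal{S}(\mathcal{A}_1)$ be
a non-empty set of states. The following are equivalent:

(i) There exists a stochastic map $\Psi : \mathcal{A}_2 \to \mathcal{A}_1$ such that for every $\rho \in \mathrm{co}\,\mathcal{C}$,

\[ \Psi(\Phi(\rho)) = \rho. \tag{7.1} \]

(ii) For every operator convex function $f$ on $[0, +\infty)$, and every $\rho, \sigma \in \mathrm{co}\,\mathcal{C}$,

\[ S_f(\Phi(\rho)\|\Phi(\sigma)) = S_f(\rho\|\sigma). \tag{7.2} \]

(iii) The equality (7.2) holds for every $\rho \in \mathcal{C}$ and for some $\sigma \in \mathcal{S}(\mathcal{A}_1)$ such that $\mathrm{supp}\,\sigma \ge
\mathrm{supp}\,\mathcal{C}$, and some $f \in \mathcal{F}$ such that $|\mathrm{supp}\,\mu_f| \ge d_2$.

(iv) $S_{\varphi_t}(\Phi(\rho)\|\Phi(\sigma)) = S_{\varphi_t}(\rho\|\sigma)$ for every $\rho \in \mathcal{C}$ and for some $\sigma \in \mathcal{S}(\mathcal{A}_1)$ such that $\mathrm{supp}\,\sigma \ge
\mathrm{supp}\,\mathcal{C}$, and a set $T$ of $t$'s such that $|T| \ge d_2$.

(v) For every $\rho, \sigma \in \mathrm{co}\,\mathcal{C}$ and every $r \in \mathbb{R}$,

\[ H_r(\Phi(\rho)\|\Phi(\sigma)) = H_r(\rho\|\sigma). \tag{7.3} \]

(vi) The equality in (7.3) holds for every $\rho \in \mathcal{C}$ and for some $\sigma \in \mathcal{S}(\mathcal{A}_1)$ such that $\mathrm{supp}\,\sigma \ge
\mathrm{supp}\,\mathcal{C}$, and for every $r \in (0, \delta)$ for some $\delta > 0$.

(vii) For every $\rho \in \mathrm{co}\,\mathcal{C}$ and every $\sigma \in \mathrm{co}\,\mathcal{C}$ such that $\mathrm{supp}\,\sigma = \mathrm{supp}\,\mathcal{C}$,

\[ \Phi_\sigma^*(\Phi(\rho)) = \rho. \tag{7.4} \]

(viii) The equality (7.4) holds for every $\rho \in \mathcal{C}$ and some $\sigma \in \mathcal{S}(\mathcal{A}_1)$.

(ix) There exist decompositions $\mathrm{supp}\,\mathcal{C} = \bigoplus_{k=1}^r \mathcal{H}_{1,k,L}\otimes\mathcal{H}_{1,k,R}$ and $\mathrm{supp}\,\Phi(\mathcal{C}) = \bigoplus_{k=1}^r \mathcal{H}_{2,k,L}\otimes
\mathcal{H}_{2,k,R}$, invertible density operators $\omega_k$ on $\mathcal{H}_{1,k,R}$ and $\tilde\omega_k$ on $\mathcal{H}_{2,k,R}$, and unitaries $U_k :
\mathcal{H}_{1,k,L} \to \mathcal{H}_{2,k,L}$, $k = 1, \ldots, r$, such that every $\rho \in \mathcal{C}$ can be written in the form

\[ \rho = \bigoplus_{k=1}^r p_k\,\rho_{k,L}\otimes\omega_k \]

with some density operators $\rho_{k,L}$ on $\mathcal{H}_{1,k,L}$ and probability distribution $\{p_k\}_{k=1}^r$, and
$\Phi(A\otimes\omega_k) = U_kAU_k^*\otimes\tilde\omega_k$, $A \in \mathcal{B}(\mathcal{H}_{1,k,L})$.

Moreover, if $\Phi$ is $n$-positive/completely positive then $\Psi$ in (i) can also be chosen to be
$n$-positive/completely positive. The implications (iii) $\Rightarrow$ (iv) $\Rightarrow$ (viii) hold also
if we only assume $\Phi$ to be substochastic.
Furthermore, criterion (x) below is sufficient for (viii) to hold, and it is also necessary
if $\Phi$ is completely positive.

(x) For every $\rho \in \mathcal{C}$, every $p \in (0, 1)$, every $n \in \mathbb{N}$, and for some $\sigma \in \mathcal{S}(\mathcal{A}_1)$ such that
$\mathrm{supp}\,\sigma \ge \mathrm{supp}\,\mathcal{C}$,

\[ T_p\bigl(\Phi^{\otimes n}(\rho^{\otimes n})\,\|\,\Phi^{\otimes n}(\sigma^{\otimes n})\bigr) = T_p\bigl(\rho^{\otimes n}\,\|\,\sigma^{\otimes n}\bigr). \tag{7.5} \]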



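Proof. The implications (ii) $\Rightarrow$ (iii) $\Rightarrow$ (iv) $\Rightarrow$ (viii) follow immediately from Theorem
5.1 (note that in the implication (iii) $\Rightarrow$ (iv), $T$ can be chosen to be $\mathrm{supp}\,\mu_f$, and hence it
is independent of the pair $(\rho, \sigma)$). If (viii) holds then
$\rho = \Phi_\sigma^*(\Phi(\rho)) = \sigma^{1/2}\Phi^*\bigl(\Phi(\sigma)^{-1/2}\Phi(\rho)\Phi(\sigma)^{-1/2}\bigr)\sigma^{1/2}$ implies that $\mathrm{supp}\,\rho \le \mathrm{supp}\,\sigma$ for every
$\rho \in \mathrm{co}\,\mathcal{C}$, and hence $\Phi_\sigma^*$ can be completed to a map $\Psi$ as required in (i) the same way as in the
proof of (x) $\Rightarrow$ (i) in Theorem 5.1. This proves (viii) $\Rightarrow$ (i). Assume that (i) holds. Fixing any
$\rho \in \mathrm{co}\,\mathcal{C}$ and $\sigma \in \mathrm{co}\,\mathcal{C}$ such that $\mathrm{supp}\,\sigma = \mathrm{supp}\,\mathcal{C}$, we have $\Psi(\Phi(\rho)) = \rho$ and $\Psi(\Phi(\sigma)) = \sigma$, and
Theorem 5.1 yields (7.4) for this pair $(\rho, \sigma)$, proving (vii). The implication (vii) $\Rightarrow$ (viii)
is obvious.

The implication (i) $\Rightarrow$ (v) follows by Proposition 6.3, and the implication (v) $\Rightarrow$ (vi) is
obvious. Assume now that (vi) holds. Then, by (6.6) and (6.7), we have $S(\Phi(\rho)\|\Phi(\sigma)) =
S(\rho\|\sigma)$, i.e., the equality holds for the standard relative entropy, which is the $f$-divergence
corresponding to $f(x) = x\log x$. Since the support of the representing measure for $x\log x$
is $(0, +\infty)$, this yields (iii).

The implication (x) $\Rightarrow$ (vi) follows from (6.10). Assume that $\Phi$ is completely positive and
(i) holds. Then we can assume $\Psi$ to be completely positive,
and hence $\Phi^{\otimes n}$ and $\Psi^{\otimes n}$ are positive and trace-preserving for every $n \in \mathbb{N}$. Thus, by the
monotonicity of the measures $T_p$,

\[ T_p(\rho^{\otimes n}\|\sigma^{\otimes n}) = T_p\bigl(\Psi^{\otimes n}(\Phi^{\otimes n}(\rho^{\otimes n}))\,\|\,\Psi^{\otimes n}(\Phi^{\otimes n}(\sigma^{\otimes n}))\bigr) \le T_p\bigl(\Phi^{\otimes n}(\rho^{\otimes n})\,\|\,\Phi^{\otimes n}(\sigma^{\otimes n})\bigr) \le T_p(\rho^{\otimes n}\|\sigma^{\otimes n}), \]

and hence (x) holds.

Finally, (vii) $\Rightarrow$ (ix) follows due to Lemma 3.11, and (ix) $\Rightarrow$ (vii) is a matter of
straightforward computation. □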
Briefly, the above theorem tells that if the noise doesn't decrease some suitable measure
of the pairwise distinguishability on a set of states then its action can be reversed on that set
with some other quantum operation; moreover, the reversion operation can be constructed by
using the noise operation and any state with maximal support. There are apparent differences
between the conditions given above; indeed, (iii) tells that the preservation of one single
$f$-divergence is sufficient, while (iv) requires the preservation of sufficiently (but finitely) many
$f$-divergences, (v) requires the preservation of a continuum number of measures, and (x)
requires even more. The equivalence between (iii) and (iv) is easy to understand; as we have
seen in the proof of Theorem 5.1, as far as monotonicity and equality in the monotonicity
are considered, any $f$-divergence with $f \in \mathcal{F}$ is equivalent to the collection of $\varphi_t$-divergences
with $t \in \mathrm{supp}\,\mu_f$, and the condition on the cardinality of $\mathrm{supp}\,\mu_f$ is imposed so that any
function on the joint spectrum of the relative modular operators can be decomposed as a
linear combination of $\varphi_t$'s, which in turn is needed to construct the inversion map. The
main open question here is whether this support condition is really necessary, or already the
preservation of $S_{\varphi_t}$ for one single $t$ would yield the reversibility of the noise, as is the case for
classical systems (i.e., commutative algebras); see Remark 5.5 and Appendix A.

Note that (iii) tells in particular that the preservation of the pairwise Rényi relative
entropies for one single parameter value $\alpha \in (0, 2)$ is sufficient for reversibility. This is in contrast
with (vi), where the preservation of continuum many Hoeffding distances is required, despite
the symmetry suggested by (6.3) and (6.5). On the other hand, we have the following:



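7.2 Proposition. In the setting of Theorem 7.1, assume that there exists a $\mathcal{C}_0 \subset \mathcal{S}(\mathcal{A}_1)$ such
that $\mathrm{co}\,\mathcal{C}_0 = \mathrm{co}\,\mathcal{C}$, and a $\sigma \in \mathcal{S}(\mathcal{A}_1)$ such that $\mathrm{supp}\,\sigma \ge \mathrm{supp}\,\mathcal{C}$, and the following hold:

\[ 0 < m := \inf_{\rho\in\mathcal{C}_0}\bigl\{-\psi(0|\Phi(\rho)\|\Phi(\sigma)) - \psi'(0|\Phi(\rho)\|\Phi(\sigma))\bigr\}, \]

and for some $r \in (0, m)$,

\[ H_r(\Phi(\rho)\|\Phi(\sigma)) = H_r(\rho\|\sigma), \qquad \rho \in \mathcal{C}_0. \]

Then $\Phi_\sigma^*(\Phi(\rho)) = \rho$ for every $\rho \in \mathrm{co}\,\mathcal{C}$.

Proof. Immediate from Theorem 6.9. □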



Finally, if all the states in $\mathcal{C}$ have the same support then some of the conditions in Theorem
7.1 and Proposition 7.2 can be simplified, and we can give a simple condition in terms of
preservation of the Chernoff distance:

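7.3 Proposition. Let $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ be a trace-preserving 2-positive map and let $\mathcal{C} \subset \mathcal{S}(\mathcal{A}_1)$
be a non-empty set of states such that $\mathrm{supp}\,\rho = \mathrm{supp}\,\mathcal{C}$ for every $\rho \in \mathcal{C}$. Assume that there
exists a $\sigma \in \mathcal{S}(\mathcal{A}_1)$ such that $\mathrm{supp}\,\sigma = \mathrm{supp}\,\mathcal{C}$ and one of the following holds:

(i) There exists a $p \in (0, 1)$ such that

\[ T_p\bigl(\Phi^{\otimes n}(\rho^{\otimes n})\,\|\,\Phi^{\otimes n}(\sigma^{\otimes n})\bigr) = T_p\bigl(\rho^{\otimes n}\,\|\,\sigma^{\otimes n}\bigr), \qquad \rho \in \mathcal{C},\ n \in \mathbb{N}. \tag{7.6} \]

(ii) For every $\rho \in \mathcal{C}$,

\[ C(\Phi(\rho)\|\Phi(\sigma)) = C(\rho\|\sigma). \]

(iii) There exists a $\mathcal{C}_0$ such that $\mathrm{co}\,\mathcal{C}_0 = \mathrm{co}\,\mathcal{C}$ and an $r \in \bigl(0, \inf_{\rho\in\mathcal{C}_0} S(\Phi(\sigma)\|\Phi(\rho))\bigr)$ such that
for every $\rho \in \mathcal{C}_0$,

\[ H_r(\Phi(\rho)\|\Phi(\sigma)) = H_r(\rho\|\sigma). \tag{7.7} \]

Then

\[ \Phi_\sigma^*(\Phi(\rho)) = \rho, \qquad \rho \in \mathrm{co}\,\mathcal{C}. \tag{7.8} \]

Proof. The implication (i) $\Rightarrow$ (ii) is immediate from (6.9), and (ii) implies (7.8) due to
Corollary 6.10. Assume now that (iii) holds. Since $\mathrm{supp}\,\rho = \mathrm{supp}\,\sigma$, $\rho \in \mathcal{C}_0$, we have
$\psi(0|\Phi(\rho)\|\Phi(\sigma)) = 0$ and $-\psi'(0|\Phi(\rho)\|\Phi(\sigma)) = S(\Phi(\sigma)\|\Phi(\rho))$, $\rho \in \mathcal{C}_0$. Hence, (7.7) yields (7.8) due to
Proposition 7.2. □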



Note that the conditions (7.5) and (7.6) are very different from the others, as they require
the preservation of some measure for arbitrary tensor powers. These conditions could be
simplified if the trace-norm distance could be represented as an $f$-divergence. Note that this
is possible in the classical case; indeed, if $p$ and $q$ are probability density functions on some
finite set $X$, and $f(x) := |x - 1|$, $x \in \mathbb{R}$, then

\[ S_f(p\|q) = \sum_{x\in X} q(x)\,|p(x)/q(x) - 1| = \sum_{x\in X} |p(x) - q(x)| = \|p - q\|_1. \]

Note, however, that the above $f$ is not operator convex, and hence the proof given in Theorem
5.1 wouldn't work for it. Even worse, the trace-norm distance cannot be represented as an
$f$-divergence, as we show below by a simple argument.
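The classical identity above is easy to confirm on a small example. The following sketch assumes a strictly positive $q$, so that no division by zero occurs; the two distributions and the function names are hypothetical.

```python
def f_divergence(p, q, f):
    """Classical f-divergence sum_x q(x) f(p(x)/q(x)) for strictly positive q."""
    return sum(qx * f(px / qx) for px, qx in zip(p, q))

def l1_distance(p, q):
    """The l1 distance sum_x |p(x) - q(x)|."""
    return sum(abs(px - qx) for px, qx in zip(p, q))

# Hypothetical probability distributions on a three-point set.
p = [0.5, 0.3, 0.2]
q = [0.25, 0.25, 0.5]

divergence = f_divergence(p, q, lambda x: abs(x - 1.0))
distance = l1_distance(p, q)
```

Both quantities agree up to rounding, as the displayed computation predicts.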

7.4 Corollary. If the observable algebra of a quantum system is non-commutative then the
trace-norm distance on its state space cannot be represented as an $f$-divergence.

Proof. Assume that $\mathcal{A} \subset \mathcal{B}(\mathcal{H})$ is non-commutative; then we can find orthonormal vectors
$e_1, e_2 \in \mathcal{H}$ such that $|e_i\rangle\langle e_j| \in \mathcal{A}$, $i, j = 1, 2$. Assume that the trace-norm distance can be
represented as an $f$-divergence. Then, for every $s \in [0, 1]$ and $t \in (0, 1)$, when $\rho := s|e_1\rangle\langle e_1| +
(1-s)|e_2\rangle\langle e_2|$ and $\sigma := t|e_1\rangle\langle e_1| + (1-t)|e_2\rangle\langle e_2|$, we have

\[ t f(s/t) + (1-t) f\bigl((1-s)/(1-t)\bigr) = S_f(\rho\|\sigma) = \|\rho - \sigma\|_1 = 2|s - t|. \]

Letting $s = t$ gives $f(1) = 0$. Letting $t \searrow 0$ gives $s\,\omega(f) + f(1-s) = 2s$ for all $s \in (0, 1]$. This
implies that $\omega(f)$ is finite and $\omega(f) + f(0) = 2$. Now let $\rho := |e_1\rangle\langle e_1|$ and $\sigma := |\psi\rangle\langle\psi|$, where
$\psi := (e_1 + e_2)/\sqrt{2}$. Then $\|\rho - \sigma\|_1 = \sqrt{2}$, while by (2.6) one can easily compute

\[ S_f(\rho\|\sigma) = \tfrac12 f(1) + \tfrac12\omega(f) + \tfrac12 f(0) = \tfrac12\bigl(\omega(f) + f(0)\bigr) = 1. \qquad □ \]
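The two numbers that clash at the end of the proof can be reproduced numerically. This is only an illustrative sketch: the trace norm is computed from the eigenvalues of $\rho - \sigma$, and the value of the hypothetical $f$-divergence is not computed from any concrete $f$ but from the two constraints $f(1) = 0$ and $\omega(f) + f(0) = 2$ derived in the proof.

```python
import numpy as np

e1 = np.array([1.0, 0.0])
e2 = np.array([0.0, 1.0])
psi_vec = (e1 + e2) / np.sqrt(2.0)

rho = np.outer(e1, e1)              # |e1><e1|
sigma = np.outer(psi_vec, psi_vec)  # |psi><psi|

# Trace norm of the self-adjoint matrix rho - sigma: sum of absolute eigenvalues.
trace_norm_distance = np.abs(np.linalg.eigvalsh(rho - sigma)).sum()

# Constraints forced by the commuting states in the proof:
# f(1) = 0 and omega(f) + f(0) = 2.
f_at_1 = 0.0
omega_plus_f0 = 2.0
# Value any representing f would have to assign to this non-commuting pair.
forced_divergence = 0.5 * f_at_1 + 0.5 * omega_plus_f0
```

The trace-norm distance is $\sqrt{2}$ while the forced value is 1, which is the contradiction used in the proof.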

7.5 Remark. A similar argument as above can be used to show that for any $p \in (0, 1)$, the
measure $D_p(\rho\|\sigma) := 1 - \|p\rho - (1-p)\sigma\|_1$ cannot be represented as an $f$-divergence on the
state space of any non-commutative finite-dimensional C*-algebra.

7.6 Remark. In general, a function on pairs of classical probability distributions might have
several different extensions to quantum states. A function that can be represented as an
$f$-divergence has an extension given by the corresponding quantum $f$-divergence. It is not clear
whether this extension has any operational significance in the case of $f(x) := |x - 1|$.

While the impossibility to represent the trace-norm distance as an $f$-divergence shows that
the approach followed in Theorem 7.1 cannot be used to simplify the condition in (x) of the
theorem, other approaches might lead to better results. Indeed, the results of the recent paper
[6] can be reformulated in the following way:

7.7 Theorem. Let $\mathcal{C} \subset \mathcal{S}(\mathcal{A}_1)$ be a convex set of states and let $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ be a completely
positive trace-preserving map such that

$$T_p(\Phi(\rho)\|\Phi(\sigma)) = T_p(\rho\|\sigma), \qquad p \in (0,1).$$

Then the fixed-point set of $\Phi_P^* \circ \Phi$ is a C*-subalgebra of $P\mathcal{A}_1P$, where $P$ is the projection onto
$\mathrm{supp}\,\mathcal{C}$, and the trace-preserving conditional expectation $\mathcal{P}$ from $P\mathcal{A}_1P$ onto $\ker(\mathrm{id} - \Phi_P^* \circ \Phi)$
is $T_p$-preserving for all $p \in (0,1)$. If, moreover, the restriction of $\mathcal{P}$ onto $\mathcal{C}$ is surjective onto
the state space of $\ker(\mathrm{id} - \Phi_P^* \circ \Phi)$ then (i)-(x) of Theorem 7.1 hold.

Note that the continuum many conditions requiring the preservation of $T_p$ for all $p \in (0,1)$
in Theorem 7.7 can be simplified to a single condition, requiring that $\Phi$ is trace-norm
preserving on the real subspace generated by $\mathcal{C}$. Note also that the surjectivity condition is
sufficient but obviously not necessary. It is, however, an open question whether it can be 
completely removed. In the approach followed in [B], it is important that one starts with a 
convex set of states. The same problem was studied in [23] in a different setting, and the 
following has been shown: 

7.8 Theorem. Let $\rho, \sigma \in \mathcal{S}(\mathcal{A})$ be invertible density operators and $\Phi$ be the trace-preserving
conditional expectation onto a subalgebra $\mathcal{A}_0$ of $\mathcal{A}$. Assume that $T_p(\Phi(\rho)\|\Phi(\sigma)) = T_p(\rho\|\sigma)$
for every $p \in (0,1)$, and $\mathcal{A}_0$ is commutative or $\rho$ and $\sigma$ commute. Then $\Phi^*(\Phi(\rho)) = \rho$ and

7.9 Remark. In [23] the condition $T_p(\Phi(\rho)\|\Phi(\sigma)) = T_p(\rho\|\sigma)$, $p \in (0,1)$, was called
2-sufficiency, and (7.6) was called (2, n)-sufficiency. It was also shown in Theorem 6 of [23] that
in the setting of Theorem 7.8, (7.6) is sufficient for the conclusion of Theorem 7.8 to hold.



8 An integral representation for operator convex functions

Operator monotone and operator convex functions play an important role in quantum infor- 
mation theory [33]. Several ways are known to decompose them as integrals of some families 
of functions of simpler forms [U [19]. Here we present a representation that is well-suited for 
our analysis of f-divergences, and seems to be a new result.

8.1 Theorem. A continuous real-valued function f on $[0, +\infty)$ is operator convex if and only
if there exist a real number $a$, a non-negative number $b$, and a non-negative measure $\mu$ on
$(0, +\infty)$, satisfying

$$\int_{(0,+\infty)} \frac{d\mu(t)}{(1+t)^2} < +\infty, \tag{8.1}$$

such that

$$f(x) = f(0) + ax + bx^2 + \int_{(0,+\infty)} \left(\frac{x}{1+t} - \frac{x}{x+t}\right) d\mu(t), \qquad x \in [0, +\infty). \tag{8.2}$$

Moreover, the numbers $a$, $b$, and the measure $\mu$ are uniquely determined by f, and

$$b = \lim_{x\to+\infty} \frac{f(x)}{x^2}, \qquad a = f(1) - f(0) - b.$$

Proof. Obviously, if f admits an integral representation as in (8.2) then f is operator convex,
and

$$f(1) = f(0) + a + b, \qquad b = \lim_{x\to+\infty} \frac{f(x)}{x^2},$$

where the latter follows by the Lebesgue dominated convergence theorem, using (8.1) and
that, for $x > 1$,

$$0 \le \frac{1}{x^2}\left(\frac{x}{1+t} - \frac{x}{x+t}\right) = \frac{x-1}{x(x+t)(1+t)} \le \frac{2x}{x(1+t)(1+t)} = \frac{2}{(1+t)^2}.$$

Hence what is left to prove is that any operator convex function admits a representation as
in (8.2), and that the measure $\mu$ is uniquely determined by f.

Assume now that f is an operator convex function on $[0, +\infty)$. Then, by Kraus' theorem
(see [28] or Corollary 2.7.8 in [19]), the function

$$g(x) := \frac{f(x) - f(1)}{x - 1}, \quad x \in [0, +\infty) \setminus \{1\}, \qquad g(1) := f'(1),$$

is an operator monotone function on $(0, +\infty)$. Therefore, it admits an integral representation

$$g(x) = a' + bx + \int_{(0,+\infty)} \frac{x(1+t)}{x+t}\, dm(t), \qquad x \in [0, +\infty), \tag{8.3}$$

where $m$ is a positive finite measure on $(0, +\infty)$, and

$$a' = g(0) = f(1) - f(0), \qquad 0 \le b = \lim_{x\to+\infty} \frac{g(x)}{x} = \lim_{x\to+\infty} \frac{f(x)}{x^2}$$

(see Theorem 2.7.11 in [19] or pp. 144-145 in [4]). Here, the measure $m$, as well as $a'$, $b$, are
unique and

$$m((0, +\infty)) = g(1) - a' - b = f'(1) - f(1) + f(0) - b.$$

Thus, we have

$$\begin{aligned}
f(x) &= f(1) + g(x)(x - 1) \\
&= f(1) + (f(1) - f(0))(x - 1) + bx(x - 1) + \int_{(0,+\infty)} \frac{x(x-1)(1+t)}{x+t}\, dm(t) \\
&= f(0) + (f(1) - f(0) - b)x + bx^2 + \int_{(0,+\infty)} \left(\frac{x}{1+t} - \frac{x}{x+t}\right)(1+t)^2\, dm(t) \\
&= f(0) + ax + bx^2 + \int_{(0,+\infty)} \left(\frac{x}{1+t} - \frac{x}{x+t}\right) d\mu(t),
\end{aligned}$$

where we have defined $a := f(1) - f(0) - b$ and $d\mu(t) := (1+t)^2\, dm(t)$. Finiteness of $m$ yields
that $\mu$ satisfies (8.1).

Finally, to see the uniqueness of the measure $\mu$, assume that f admits an integral
representation as in (8.2). Then, f is operator convex, and hence the function $g$ on $[0, +\infty)$, defined
as

$$g(x) := \frac{f(x) - f(1)}{x - 1} = a + b + bx + \int_{(0,+\infty)} \frac{x(1+t)}{x+t}\, \frac{d\mu(t)}{(1+t)^2}, \tag{8.4}$$

is operator monotone. Therefore, it admits an integral representation as in (8.3), and the
uniqueness of the parameters of that representation yields that $d\mu(t) = (1+t)^2\, dm(t)$. Hence,
the measure $\mu$ is uniquely determined by f. □

8.2 Corollary. Assume that f is a continuous operator convex function on $[0, +\infty)$ that is
not a polynomial. Then it can be written in the form

$$f(x) = f(0) + bx^2 + \int_{(0,+\infty)} \left(\psi(t)x - \frac{x}{x+t}\right) d\mu(t), \qquad x \in [0, +\infty), \tag{8.5}$$

where $b = \lim_{x\to+\infty} f(x)/x^2 \ge 0$, and $\mu$ is a non-negative measure on $(0, +\infty)$. Moreover, we
can choose

$$\psi(t) := \frac{1}{1+t} + \frac{f(1) - f(0) - b}{f'(1) - f(1) + f(0) - b} \cdot \frac{1}{(1+t)^2}, \tag{8.6}$$

and if $b = 0$ and $f'(1) \ge 0$ then $\psi(t) \ge 0$, $t \in (0, +\infty)$.

Proof. Since f is operator convex, it can be written in the form (8.2) due to Theorem 8.1. Since
f is not a polynomial, we have $m((0, +\infty)) > 0$, where $dm(t) := d\mu(t)/(1+t)^2$. Moreover, by
(8.4), $f'(1) = g(1) = a + 2b + m((0, +\infty))$, from which $m((0, +\infty)) = f'(1) - a - 2b$. Using
that $a = f(1) - f(0) - b$, we finally obtain

$$a = \frac{a}{m((0,+\infty))} \int_{(0,+\infty)} dm(t) = \frac{f(1) - f(0) - b}{f'(1) - f(1) + f(0) - b} \int_{(0,+\infty)} \frac{d\mu(t)}{(1+t)^2}.$$

Substituting it into (8.2), we obtain (8.5) with $\psi$ as in (8.6). Note that
$(1+t)^2\psi(t) = 1 + t + a/m((0,+\infty))$. Hence, if $b = 0$ and $0 \le f'(1) = a + 2b + m((0, +\infty)) = a + m((0, +\infty))$
then $\psi(t) \ge 0$, proving the last assertion. □

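8.3 Example.

(i) $f(x) := x \log x$ admits the integral representation

$$x \log x = \int_{(0,+\infty)} \left(\frac{x}{1+t} - \frac{x}{x+t}\right) dt.$$

($f(0) = a = b = 0$ and $\mu$ is the Lebesgue measure in (8.2).)

(ii) $f(x) := -x^\alpha$ ($0 < \alpha < 1$) admits the integral representation (see [4, Exercise V.1.10])

$$-x^\alpha = \frac{\sin \alpha\pi}{\pi} \int_{(0,+\infty)} \left(-\frac{x}{x+t}\right) t^{\alpha-1}\, dt.$$

($f(0) = b = 0$, $d\mu(t) = \frac{\sin \alpha\pi}{\pi} t^{\alpha-1}\, dt$, and $\psi = 0$ in (8.5).) Using that
$\frac{\sin \alpha\pi}{\pi} \int_{(0,+\infty)} \frac{x}{1+t}\, t^{\alpha-1}\, dt = x$, we have

$$-x^\alpha = -x + \frac{\sin \alpha\pi}{\pi} \int_{(0,+\infty)} \left(\frac{x}{1+t} - \frac{x}{x+t}\right) t^{\alpha-1}\, dt.$$

($f(0) = b = 0$, $a = -1$, and $d\mu(t) = \frac{\sin \alpha\pi}{\pi} t^{\alpha-1}\, dt$ in (8.2).)

(iii) By the previous point, $f(x) := x^\alpha$ ($1 < \alpha < 2$) admits the representation

$$x^\alpha = \frac{\sin(\alpha-1)\pi}{\pi} \int_{(0,+\infty)} \frac{x^2 t^{\alpha-2}}{x+t}\, dt = \frac{\sin(\alpha-1)\pi}{\pi} \int_{(0,+\infty)} \left(\frac{x}{t} - \frac{x}{x+t}\right) t^{\alpha-1}\, dt.$$

($f(0) = b = 0$, $\psi(t) = 1/t$, and $d\mu(t) = \frac{\sin(\alpha-1)\pi}{\pi} t^{\alpha-1}\, dt$ in (8.5).) Using that

$$\frac{\sin(\alpha-1)\pi}{\pi} \int_{(0,+\infty)} \left(\frac{x}{t} - \frac{x}{1+t}\right) t^{\alpha-1}\, dt = \frac{\sin(\alpha-1)\pi}{\pi} \int_{(0,+\infty)} \frac{x\, t^{\alpha-2}}{1+t}\, dt = x,$$

we also obtain

$$x^\alpha = x + \frac{\sin(\alpha-1)\pi}{\pi} \int_{(0,+\infty)} \left(\frac{x}{1+t} - \frac{x}{x+t}\right) t^{\alpha-1}\, dt.$$

($f(0) = 0$, $a = 1$, $b = 0$ and $d\mu(t) = \frac{\sin(\alpha-1)\pi}{\pi} t^{\alpha-1}\, dt$ in (8.2).)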
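
Item (i) of Example 8.3 lends itself to a quick numerical sanity check. The sketch below is an illustration only: the function names are hypothetical, and the change of variables $t = u/(1-u)$ is a convenience chosen here (not taken from the paper) to map the half-line to $[0, 1]$, where the integrand becomes $x(x-1)/(x(1-u)+u)$.

```python
import math

def simpson(func, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi] with an even number n of panels."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(lo + k * h)
    return total * h / 3.0

def xlogx_via_integral(x):
    """Integral over (0, +inf) of x/(1+t) - x/(x+t) dt for x > 0,
    evaluated after the substitution t = u/(1-u)."""
    return simpson(lambda u: x * (x - 1.0) / (x * (1.0 - u) + u), 0.0, 1.0)

# Compare the integral with x log x at a few sample points.
samples = [0.5, 2.0, 7.0]
errors = [abs(xlogx_via_integral(x) - x * math.log(x)) for x in samples]
```

The weighted integrals in (ii) and (iii) have an integrable singularity at $t = 0$ and would need a more careful quadrature, so they are not covered by this sketch.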

Note that the function $\psi$ in (8.5) is not unique. For instance, if $\mu$ is finitely supported on
a set $\{t_1, \ldots, t_r\}$ then only the sum $\sum_{i=1}^r \psi(t_i)\mu(\{t_i\})$ is determined by f while the individual values
$\psi(t_1), \ldots, \psi(t_r)$ are not.

Note also that in general, $\int_{(0,+\infty)} \frac{d\mu(t)}{1+t}$ might not be finite and hence the term $\int_{(0,+\infty)} \frac{x}{1+t}\, d\mu(t)$
cannot be merged with $ax$ in (8.2). Similarly, the integral $\int_{(0,+\infty)} \psi(t)\, d\mu(t)$ might be infinite
and hence it might not be possible to separate it as a linear term in the representation (8.5)
of f. This is clear, for instance, from (i) of Example 8.3. We have the following:

8.4 Proposition. For a continuous real-valued function f on $[0, +\infty)$ the following are
equivalent:

(i) f is operator convex on $[0, +\infty)$ with $\lim_{x\to+\infty} f(x)/x < +\infty$;

(ii) there exist an $\alpha \in \mathbb{R}$ and a positive measure $\mu$ on $(0, +\infty)$, satisfying

$$\int_{(0,+\infty)} \frac{d\mu(t)}{1+t} < +\infty, \tag{8.7}$$

such that

$$f(x) = f(0) + \alpha x - \int_{(0,+\infty)} \frac{x}{x+t}\, d\mu(t), \qquad x \in [0, +\infty). \tag{8.8}$$



Proof. First, note that if f is convex on $[0, +\infty)$ as a numerical function, then $\lim_{x\to+\infty} f(x)/x$
exists in $(-\infty, +\infty]$. In fact, by convexity, $(f(x) - f(1))/(x - 1)$ is non-decreasing for $x > 1$,
so that

$$\lim_{x\to+\infty} \frac{f(x)}{x} = \lim_{x\to+\infty} \frac{f(x) - f(1)}{x - 1}$$

exists in $(-\infty, +\infty]$. Also, note that condition (8.7) is necessary for $f(1)$ to be defined in
(8.8), and also sufficient to define $f(x)$ by (8.8) for all $x \in [0, +\infty)$.

(i) ⇒ (ii). By assumption, f is an operator convex function on $[0, +\infty)$ such that $\lim_{x\to+\infty} f(x)/x$
is finite, hence $\lim_{x\to+\infty} f(x)/x^2 = 0$. By Theorem 8.1 we have

$$f(x) = f(0) + ax + \int_{(0,+\infty)} \left(\frac{x}{1+t} - \frac{x}{x+t}\right) d\mu(t), \qquad x \in [0, +\infty),$$

where $a \in \mathbb{R}$ and $\mu$ is a positive measure on $(0, +\infty)$. We write

$$\frac{f(x)}{x} = \frac{f(0)}{x} + a + \int_{(0,+\infty)} \left(\frac{1}{1+t} - \frac{1}{x+t}\right) d\mu(t).$$

Since

$$0 < \frac{1}{1+t} - \frac{1}{x+t} \nearrow \frac{1}{1+t} \quad \text{as } 1 < x \nearrow +\infty,$$

the monotone convergence theorem yields that

$$\lim_{x\to+\infty} \frac{f(x)}{x} = a + \int_{(0,+\infty)} \frac{d\mu(t)}{1+t},$$

which implies (8.7) and

$$f(x) = f(0) + \left(a + \int_{(0,+\infty)} \frac{d\mu(t)}{1+t}\right) x - \int_{(0,+\infty)} \frac{x}{x+t}\, d\mu(t).$$

Hence f admits a representation of the form (8.8).

(ii) ⇒ (i). It is obvious that f given in (8.8) is operator convex on $[0, +\infty)$. Since $\frac{1}{x+t} \le \frac{1}{1+t}$
for all $x > 1$ and all $t \in [0, +\infty)$, the Lebesgue convergence theorem yields that

$$\lim_{x\to+\infty} \int_{(0,+\infty)} \frac{d\mu(t)}{x+t} = 0,$$

and so

$$\frac{f(x)}{x} = \frac{f(0)}{x} + \alpha - \int_{(0,+\infty)} \frac{d\mu(t)}{x+t} \longrightarrow \alpha \quad \text{as } x \to +\infty.$$

Hence (i) follows. □

8.5 Remark. Note that the condition $\lim_{x\to+\infty} f(x)/x < +\infty$ puts a strong restriction on
an operator convex function f. Important examples for which it is not satisfied include
$f(x) = x \log x$ and $f(x) = x^\alpha$ for $\alpha \in (1, 2]$.



9 Closing remarks 

Quantum f-divergences are a quantum generalization of classical f-divergences, which class
in the classical case contains most of the distinguishability measures that are relevant to
classical statistics. Although our Corollary 7.4 shows that f-divergences are less universal in the
quantum case, they still provide a very efficient tool to obtain monotonicity and convexity
properties of several distinguishability measures that are relevant to quantum statistics,
including the relative entropy, the Rényi relative entropies, and the Chernoff and Hoeffding
distances.

There are also differences between the classical and the quantum cases in the technical
conditions needed to prove the monotonicity. For the approach followed here, it is important
that the defining function is not only convex but operator convex, and the map is not only
positive but it is also decomposable in the sense of Remark 4.8. It is unknown whether the
monotonicity can be proved without these assumptions in general, although Corollary 3.4
and Lemma 3.5 show for instance that positivity of $\Phi$ might be sufficient in some special
cases. For measures that have an operational interpretation in state discrimination, like the
relative entropy, the Rényi $\alpha$-relative entropies with $\alpha \in (0, 1)$, and the Chernoff and Hoeffding
distances, the monotonicity holds for any positive trace non-increasing map $\Phi$ such that $\Phi^{\otimes n}$
is positive for every $n \in \mathbb{N}$ [HI [M]. Note that this is satisfied by every completely positive
map $\Phi$, but it is neither necessary nor sufficient for the dual of $\Phi$ to satisfy the Schwarz
inequality. Indeed, transposition in some basis has this property but it is not a Schwarz map.
On the other hand, we have shown in Corollary 2.5 and Remark [22] that transposition actually
preserves any f-divergence (f doesn't even need to be convex) and hence it also preserves all
the above mentioned measures. It is unknown to the authors whether there exists any map,
other than the transposition, that is not completely positive and yet has the property that
$\Phi^{\otimes n}$ is positive for every $n \in \mathbb{N}$.

Quantum f-divergences are essentially a special case of Petz' quasi-entropies with $K = I$
(see the Introduction) with the minor modification of allowing operators that are not strictly
positive definite. While the monotonicity inequality in Theorem 4.3 can be proved for the
quasi-entropies with general $K$ quite similarly to the case $K = I$, our analysis of the equality
case in Theorem 5.1 doesn't seem to extend to $K \ne I$. A special case has been treated recently
in [22], where a characterization for the equality case in the joint convexity of the
quasi-entropies $S^K_{f_\alpha}(\cdot\|\cdot)$ (see Example 2.7 for $K = I$) was given for arbitrary $K$ and $\alpha \in (0, 2)$. Note
that joint convexity is a special case of the monotonicity under partial traces (see [JTJ Theorem
6] or Corollary 4.7 of this paper), while monotonicity under partial traces can also be proven
from the joint convexity for $K$'s of special type [2H], which in turn implies the monotonicity
under completely positive trace-preserving maps by using their Lindblad representation [SD].
For a particularly elegant recent proof of the joint convexity for general $K$'s, see [TTj.

Various characterizations of the equality in the case K = I have been given before for 
different types of maps and classes of functions, including the equality case for the strong 
subadditivity of entropy and the joint convexity of the Renyi relative entropies JTSJ [2H ESI 
ESI H21 SSI ill 113 S3 HE]. Our Theorem 5.1 extends all these results and it seems to be the
most general characterization of the equality, at least in finite dimension. The relevant part 
from the point of view of application to quantum error correction is that the preservation of 
some suitable distinguishability measure yields the reversibility of the stochastic operation, 
and the reversal map can be constructed from the original one in a canonical way. There 
are various technical conditions imposed in Theorems 5.1 and 7.1 that might be possible to
remove. For instance, it is not clear whether the support condition in (5.4) is necessary or
maybe the preservation of $S_{\varphi_t}(\cdot\|\cdot)$ for one single $t > 0$ is sufficient for reversibility. It is also
an open question whether the surjectivity condition in Theorem 7.7 can be removed.



A Commuting operators and the operator Hölder inequality

We will need the following two well-known lemmas in this section. The first one is a
generalization of the so-called log-sum inequality, while the second one is a generalization of Jensen's
inequality for the expectation values of self-adjoint operators.

A.1 Lemma. Let $f : [0, +\infty) \to \mathbb{R}$ be a convex function. Let $a_i \ge 0$, $b_i > 0$, $i = 1, \ldots, r$,
and define $a := \sum_{i=1}^r a_i$, $b := \sum_{i=1}^r b_i$. Then,

$$b f(a/b) \le \sum_{i=1}^r b_i f(a_i/b_i). \tag{A.1}$$

Moreover, if f is strictly convex, then equality holds if and only if $a_i/b_i$ is independent of $i$.

Proof. Convexity of f yields that

$$f(a/b) = f\left(\sum_{i=1}^r \frac{b_i}{b} \cdot \frac{a_i}{b_i}\right) \le \sum_{i=1}^r \frac{b_i}{b} f(a_i/b_i),$$

which yields (A.1), and the characterization of equality is immediate from the strict convexity
of f. □
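
A small numerical illustration of Lemma A.1 follows. It is a sketch only: the helper `log_sum_sides`, the choice $f(x) = x \log x$ (which gives the classical log-sum inequality), and the sample vectors are assumptions made for the example.

```python
import math

def log_sum_sides(f, a_list, b_list):
    """Return both sides of (A.1): b*f(a/b) and sum_i b_i*f(a_i/b_i),
    assuming a_i >= 0 and b_i > 0."""
    a, b = sum(a_list), sum(b_list)
    lhs = b * f(a / b)
    rhs = sum(bi * f(ai / bi) for ai, bi in zip(a_list, b_list))
    return lhs, rhs

def xlogx(x):
    """Strictly convex test function x log x, extended by 0 at x = 0."""
    return x * math.log(x) if x > 0 else 0.0

# Generic case: the ratios a_i/b_i differ, so the inequality is strict.
lhs, rhs = log_sum_sides(xlogx, [1.0, 2.0, 3.0], [2.0, 1.0, 1.0])

# Equality case: a_i/b_i = 1/2 for every i.
lhs_eq, rhs_eq = log_sum_sides(xlogx, [1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

In the second call the ratio $a_i/b_i$ is constant, so both sides coincide up to rounding, matching the equality condition of the lemma.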

A.2 Lemma. Let $A$ be a self-adjoint operator and $\rho$ be a density operator on a
finite-dimensional Hilbert space $\mathcal{H}$. If f is a convex function on the convex hull of $\mathrm{spec}(A)$ then

$$f(\mathrm{Tr}\, A\rho) \le \mathrm{Tr}\, f(A)\rho. \tag{A.2}$$

If f is strictly convex then equality holds in (A.2) if and only if $\rho^0$ is a subprojection of a
spectral projection of $A$.

Proof. Let $A = \sum_a a P_a$ be the spectral decomposition of $A$. Since $\{\mathrm{Tr}\, P_a\rho : a \in \mathrm{spec}(A)\}$ is a
probability distribution on $\mathrm{spec}(A)$, Jensen's inequality yields $f(\mathrm{Tr}\, A\rho) = f\big(\sum_a a\, \mathrm{Tr}\, P_a\rho\big) \le
\sum_a f(a)\, \mathrm{Tr}\, P_a\rho$, and it is obvious that equality holds whenever $\mathrm{Tr}\, P_a\rho = 0$ for all but one
$a \in \mathrm{spec}(A)$. On the other hand, if there are more than one $a \in \mathrm{spec}(A)$ such that $\mathrm{Tr}\, P_a\rho > 0$
then the above inequality is strict whenever f is strictly convex. □

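A.3 Proposition. Let $A, B \in \mathcal{A}_{1,+}$ be such that $A$ commutes with $B$ and let $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$
be a substochastic map such that $\Phi(A)$ commutes with $\Phi(B)$. For any convex function
$f : [0, +\infty) \to \mathbb{R}$,

$$S_f(\Phi(A)\|\Phi(B)) \le S_f(A\|B). \tag{A.3}$$

If $\mathrm{supp}\, A \le \mathrm{supp}\, B$, $\mathrm{Tr}\, \Phi(B) = \mathrm{Tr}\, B$ and f is strictly convex then equality holds in (A.3) if
and only if $\Phi_B^*(\Phi(A)) = A$.

Proof. Let us consider first the inequality (A.3). Due to the continuity property given in
Proposition 2.12 we can assume without loss of generality that $\mathrm{supp}\, A \le \mathrm{supp}\, B$. Since $A$
and $B$ commute, there exists a basis $\{e_x\}_{x\in\mathcal{X}}$ in $\mathrm{supp}\, B$ such that $A = \sum_{x\in\mathcal{X}} A(x)|e_x\rangle\langle e_x|$ and
$B = \sum_{x\in\mathcal{X}} B(x)|e_x\rangle\langle e_x|$, where $A(x) := \langle e_x, Ae_x\rangle$, $B(x) := \langle e_x, Be_x\rangle$, $x \in \mathcal{X}$. Similarly, there
exists a basis $\{f_y\}_{y\in\mathcal{Y}}$ in $\mathrm{supp}\, \Phi(B)$ such that $\Phi(A) = \sum_{y\in\mathcal{Y}} \Phi(A)(y)|f_y\rangle\langle f_y|$ and $\Phi(B) =
\sum_{y\in\mathcal{Y}} \Phi(B)(y)|f_y\rangle\langle f_y|$, where $\Phi(A)(y) := \langle f_y, \Phi(A)f_y\rangle$, $\Phi(B)(y) := \langle f_y, \Phi(B)f_y\rangle$. We have

$$S_f(A\|B) = \sum_{x\in\mathcal{X}} B(x) f\left(\frac{A(x)}{B(x)}\right), \qquad S_f(\Phi(A)\|\Phi(B)) = \sum_{y\in\mathcal{Y}} \Phi(B)(y) f\left(\frac{\Phi(A)(y)}{\Phi(B)(y)}\right).$$

Let $T_{xy} := \langle f_y, \Phi(|e_x\rangle\langle e_x|) f_y\rangle$; then $\Phi(A)(y) = \sum_{x\in\mathcal{X}} T_{xy}A(x)$, $\Phi(B)(y) = \sum_{x\in\mathcal{X}} T_{xy}B(x)$,
and Lemma A.1 yields

$$\Phi(B)(y) f\left(\frac{\Phi(A)(y)}{\Phi(B)(y)}\right) \le \sum_{x\in\mathcal{X}} T_{xy}B(x) f\left(\frac{A(x)}{B(x)}\right). \tag{A.4}$$

Since $\Phi$ is substochastic, $\sum_{y\in\mathcal{Y}} T_{xy} \le 1$, and summing over $y$ in (A.4) yields (A.3).

Assume now that $\mathrm{supp}\, A \le \mathrm{supp}\, B$ and $\mathrm{Tr}\, \Phi(B) = \mathrm{Tr}\, B$; then $0 = \mathrm{Tr}\, B - \mathrm{Tr}\, \Phi(B) =
\sum_{x\in\mathcal{X}} B(x)\big(1 - \sum_{y\in\mathcal{Y}} T_{xy}\big)$, and hence $\sum_{y} T_{xy} = 1$, $x \in \mathcal{X}$. Obviously, equality holds in
(A.3) if and only if (A.4) holds with equality for every $y \in \mathcal{Y}$. Assuming that f is strictly
convex, we obtain, due to Lemma A.1, that for every $y \in \mathcal{Y}$ there exists a positive constant
$c(y)$ such that $T_{xy}A(x) = c(y)T_{xy}B(x)$, i.e.,

$$A(x) = c(y)B(x) \tag{A.5}$$

for every $x$ such that $T_{xy} > 0$. Assume that (A.5) holds; then we have $\Phi(A)(y) = \sum_x T_{xy}A(x) =
\sum_x T_{xy}c(y)B(x) = c(y)\Phi(B)(y)$ and hence,

$$\Phi_B^*(\Phi(A))(x) = B(x) \sum_{y} T_{xy} \frac{\Phi(A)(y)}{\Phi(B)(y)} = B(x) \sum_{y} T_{xy}c(y) = A(x). \qquad \Box$$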

The following Proposition gives an important special case where the monotonicity
inequality (A.3) holds even though $A$ and $B$ don't commute and f is only assumed to be convex.

A.4 Proposition. Let $A, B \in \mathcal{A}_+$ be such that $B \ne 0$, let $B = \sum_{b\in\mathrm{spec}(B)} bQ_b$ be the spectral
decomposition of $B$ and let $\mathcal{E}_B : X \mapsto \sum_{b\in\mathrm{spec}(B)} Q_bXQ_b$ be the pinching defined by $B$. For
every convex function $f : [0, +\infty) \to \mathbb{R}$,

$$S_f(A\|B) \ge S_f(\mathcal{E}_B(A)\|\mathcal{E}_B(B)) = S_f(\mathcal{E}_B(A)\|B) \ge (\mathrm{Tr}\, B) f\left(\frac{\mathrm{Tr}\, A}{\mathrm{Tr}\, B}\right). \tag{A.6}$$

Moreover, if f is strictly convex then the first inequality in (A.6) holds with equality if
and only if $A$ commutes with $B$, and the second inequality holds with equality if and only if
$\mathcal{E}_B(A)$ is a constant multiple of $B$. In particular, $S_f(A\|B) = (\mathrm{Tr}\, B) f\left(\frac{\mathrm{Tr}\, A}{\mathrm{Tr}\, B}\right)$ if and only if $A$
is a constant multiple of $B$.

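Proof. All the assertions are obvious when $A = 0$, so for the rest we assume $A \ne 0$. Assume
first that $\mathrm{supp}\, A \le \mathrm{supp}\, B$. For every $b \in \mathrm{spec}(B)$ and $\lambda \in \mathbb{R}$, let $P_\lambda^{(b)}$ be the spectral
projection of $Q_bAQ_b$ corresponding to the singleton $\{\lambda\}$, and let $\tilde{P}_\lambda^{(b)} := Q_bP_\lambda^{(b)}Q_b$. Note
that $\tilde{P}_\lambda^{(b)} = P_\lambda^{(b)}$ for every $\lambda \ne 0$, and $Q_b = \sum_\lambda \tilde{P}_\lambda^{(b)}$. The spectral projection of $\mathcal{E}_B(A)$
corresponding to the singleton $\{\lambda\}$ is $\sum_{b\in\mathrm{spec}(B)} \tilde{P}_\lambda^{(b)}$. For every $b \in \mathrm{spec}(B) \setminus \{0\}$ and $\lambda \in \mathbb{R}$,
let $\rho_{b,\lambda}$ be a density operator such that $\rho_{b,\lambda} = \tilde{P}_\lambda^{(b)}/\mathrm{Tr}\, \tilde{P}_\lambda^{(b)}$ whenever $\tilde{P}_\lambda^{(b)} \ne 0$. By (2.6), we
have

$$\begin{aligned}
S_f(\mathcal{E}_B(A)\|\mathcal{E}_B(B)) = S_f(\mathcal{E}_B(A)\|B) &= \sum_{b\in\mathrm{spec}(B)\setminus\{0\}} \sum_\lambda b f(\lambda/b)\, \mathrm{Tr} \Big(\sum_{b'\in\mathrm{spec}(B)} \tilde{P}_\lambda^{(b')}\Big) Q_b \\
&= \sum_{b\in\mathrm{spec}(B)\setminus\{0\}} \sum_\lambda b f(\lambda/b)\, \mathrm{Tr}\, \tilde{P}_\lambda^{(b)} = \sum_{b\in\mathrm{spec}(B)\setminus\{0\}} \sum_\lambda b f\big(\mathrm{Tr}((A/b)\rho_{b,\lambda})\big)\, \mathrm{Tr}\, \tilde{P}_\lambda^{(b)} \\
&\le \sum_{b\in\mathrm{spec}(B)\setminus\{0\}} \sum_\lambda b\, \mathrm{Tr}\, f(A/b)\rho_{b,\lambda}\, \mathrm{Tr}\, \tilde{P}_\lambda^{(b)} = \sum_{b\in\mathrm{spec}(B)\setminus\{0\}} \sum_\lambda b\, \mathrm{Tr}\, f(A/b)\tilde{P}_\lambda^{(b)} \qquad\qquad \text{(A.7)} \\
&= \sum_{b\in\mathrm{spec}(B)\setminus\{0\}} b\, \mathrm{Tr}\, f(A/b)Q_b = \sum_{b\in\mathrm{spec}(B)\setminus\{0\}} \sum_{a\in\mathrm{spec}(A)} b f(a/b)\, \mathrm{Tr}\, P_aQ_b = S_f(A\|B),
\end{aligned}$$

where $A = \sum_a aP_a$ is the spectral decomposition of $A$, and the inequality in (A.7) follows
due to Lemma A.2. This yields the first inequality in (A.6). If $A$ commutes with $B$ then
$\mathcal{E}_B(A) = A$ and hence the first inequality in (A.6) holds with equality. Conversely, assume
that the first inequality in (A.6) holds with equality; then the inequality in (A.7) has to hold
with equality as well. If f is strictly convex then this implies that for every $b \in \mathrm{spec}(B) \setminus \{0\}$
and $\lambda \in \mathbb{R}$, there exists an $a(b, \lambda)$ such that $\tilde{P}_\lambda^{(b)} \le P_{a(b,\lambda)}$, due to Lemma A.2. In particular,
$\tilde{P}_\lambda^{(b)}$ commutes with $A$, and, since $Q_b = \sum_\lambda \tilde{P}_\lambda^{(b)}$, so does also $Q_b$, which finally implies that
$B$ commutes with $A$.

Consider now the stochastic map $\Phi : \mathcal{A} \to \mathbb{C}$, $\Phi(X) := \mathrm{Tr}\, X$, $X \in \mathcal{A}$. Since $\mathcal{E}_B(A)$ and
$B$, as well as $\Phi(\mathcal{E}_B(A)) = \mathrm{Tr}\, A$ and $\Phi(B) = \mathrm{Tr}\, B$, commute, the second inequality in (A.6)
follows due to Proposition A.3, which also yields that this inequality holds with equality if
and only if $\mathcal{E}_B(A) = \Phi_B^*(\Phi(\mathcal{E}_B(A))) = (\mathrm{Tr}\, A/\mathrm{Tr}\, B)B$.

Finally, consider the general case where $\mathrm{supp}\, A \le \mathrm{supp}\, B$ does not necessarily hold. For
every $\varepsilon > 0$, let $B_\varepsilon := B + \varepsilon I$. Note that $\mathrm{supp}\, A \le \mathrm{supp}\, B_\varepsilon$ and $\mathcal{E}_{B_\varepsilon} = \mathcal{E}_B$ for every $\varepsilon > 0$, and
hence by the above, $S_f(A\|B_\varepsilon) \ge S_f(\mathcal{E}_B(A)\|B_\varepsilon) \ge (\mathrm{Tr}\, B_\varepsilon) f\left(\frac{\mathrm{Tr}\, A}{\mathrm{Tr}\, B_\varepsilon}\right)$ for every $\varepsilon > 0$. Taking
the limit $\varepsilon \searrow 0$ then yields (A.6). □
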



The first inequality above was proved for the case $f = f_\alpha$, $\alpha > 1$, in Section 3.7 of [Tj
and we followed essentially the same proof here. It was also proved in Section 3.7 of [Tj
that the monotonicity inequality (4.20) extends for the values $\alpha \in (2, +\infty)$ if $\Phi(A)$ and $\Phi(B)$
commute. We conjecture that this holds in more generality, namely that the monotonicity
inequality $S_f(\Phi(A)\|\Phi(B)) \le S_f(A\|B)$ holds for every convex f if $A$ and $B$ or $\Phi(A)$ and
$\Phi(B)$ commute. The inequality $S_f(A\|B) \ge (\mathrm{Tr}\, B) f\left(\frac{\mathrm{Tr}\, A}{\mathrm{Tr}\, B}\right)$ was given in Theorem 3 of [4T]
for the case where $A$ and $B$ are invertible density operators and f is a non-linear operator
convex function. Note that the inequality between the first and the last term in (A.6) is a
non-commutative generalization of the generalized log-sum inequality (A.1).

A.5 Corollary. For any positive semidefinite operators $A$, $B$ on a finite-dimensional Hilbert
space $\mathcal{H}$, we have

$$\mathrm{Tr}\, A^\alpha B^{1-\alpha} \le (\mathrm{Tr}\, A)^\alpha (\mathrm{Tr}\, B)^{1-\alpha}, \qquad \alpha \in [0, 1]. \tag{A.8}$$

If, moreover, $\mathrm{supp}\, A \le \mathrm{supp}\, B$ then

$$\mathrm{Tr}\, A^\alpha B^{1-\alpha} \ge (\mathrm{Tr}\, A)^\alpha (\mathrm{Tr}\, B)^{1-\alpha}, \qquad \alpha \in [1, +\infty). \tag{A.9}$$

If $\mathrm{supp}\, A \le \mathrm{supp}\, B$ then $\mathrm{Tr}\, A^\alpha B^{1-\alpha} = (\mathrm{Tr}\, A)^\alpha (\mathrm{Tr}\, B)^{1-\alpha}$ for some $\alpha \in (0, +\infty) \setminus \{1\}$ if and
only if $A$ is a constant multiple of $B$.

Proof. The assertions are trivial when $A$ or $B$ is equal to zero, and hence we assume that
both of them are non-zero. The inequality in (A.8) is obvious when $\alpha = 0$ or $\alpha = 1$, and the
inequality in (A.9) is obvious when $\alpha = 1$. For $\alpha \in (0, +\infty) \setminus \{1\}$, the inequalities in (A.8) and
(A.9) follow immediately by applying Proposition A.4 to the functions $f_\alpha(x) := \mathrm{sgn}(\alpha - 1)x^\alpha$.
Since these functions are strictly convex for every $\alpha \in (0, +\infty) \setminus \{1\}$, if equality holds in (A.8)
or (A.9), and $\mathrm{supp}\, A \le \mathrm{supp}\, B$, then $A$ is a constant multiple of $B$, due to Proposition A.4.
Conversely, the inequalities (A.8) and (A.9) obviously hold with equality if $A$ is a constant
multiple of $B$. □
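
The inequalities (A.8) and (A.9) can be illustrated on 2x2 matrices with a few lines of standard-library Python. Everything in this sketch is an assumption made for illustration: the helper names, the restriction to real symmetric positive definite 2x2 matrices (so that negative powers are defined and the support condition holds automatically), and the sample matrices.

```python
import math

def sym2_power(m, p):
    """Return m**p for a real symmetric positive definite 2x2 matrix m,
    using its spectral decomposition m = l1*P1 + l2*(I - P1)."""
    (a, b), (_, d) = m
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    l1, l2 = mean + radius, mean - radius
    if radius == 0.0:  # m is a multiple of the identity
        return [[l1 ** p, 0.0], [0.0, l1 ** p]]
    gap = l1 - l2
    proj = [[(a - l2) / gap, b / gap], [b / gap, (d - l2) / gap]]
    return [[l1 ** p * proj[i][j]
             + l2 ** p * ((1.0 if i == j else 0.0) - proj[i][j])
             for j in range(2)] for i in range(2)]

def trace_product(x, y):
    """Tr(x y) for 2x2 matrices given as nested lists."""
    return sum(x[i][j] * y[j][i] for i in range(2) for j in range(2))

def hoelder_sides(a_mat, b_mat, alpha):
    """Return Tr A^alpha B^(1-alpha) and (Tr A)^alpha (Tr B)^(1-alpha)."""
    lhs = trace_product(sym2_power(a_mat, alpha), sym2_power(b_mat, 1.0 - alpha))
    tr_a = a_mat[0][0] + a_mat[1][1]
    tr_b = b_mat[0][0] + b_mat[1][1]
    return lhs, tr_a ** alpha * tr_b ** (1.0 - alpha)

A = [[2.0, 1.0], [1.0, 1.0]]  # does not commute with B
B = [[1.0, 0.0], [0.0, 3.0]]
lhs_half, rhs_half = hoelder_sides(A, B, 0.5)  # regime of (A.8)
lhs_two, rhs_two = hoelder_sides(A, B, 2.0)    # regime of (A.9)
lhs_prop, rhs_prop = hoelder_sides([[2.0, 0.0], [0.0, 6.0]], B, 0.5)  # A = 2B
```

For the non-commuting pair the left-hand side falls below the right-hand side at $\alpha = 1/2$ and above it at $\alpha = 2$, while for $A = 2B$ the two sides coincide, in line with the equality statement.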

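Let $\mathcal{H}$ be a finite-dimensional Hilbert space. For every $A \in \mathcal{B}(\mathcal{H})$ and $p \in \mathbb{R} \setminus \{0\}$, let

$$\|A\|_p := \begin{cases} 0, & A = 0, \\ (\mathrm{Tr}\, |A|^p)^{1/p}, & A \ne 0, \end{cases}$$

where $|A| := \sqrt{A^*A}$. For $p \in [1, +\infty)$, this is the well-known p-norm. Note that

$$\|A^*\|_p = \|A\|_p = \|\,|A|\,\|_p$$

for every $A \in \mathcal{B}(\mathcal{H})$ and $p \in \mathbb{R} \setminus \{0\}$.

Corollary A.5 yields the following inverse Hölder inequality: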

A.6 Proposition. Let $p \in (0, 1)$ and $q < 0$ be such that $1/p + 1/q = 1$. Let $A, B \in \mathcal{B}(\mathcal{H})$ for
some finite-dimensional Hilbert space $\mathcal{H}$, and assume that $\mathrm{supp}\, |A| \le \mathrm{supp}\, |B^*|$. Then

$$\|AB\|_1 \ge \|A\|_p \|B\|_q. \tag{A.10}$$

Moreover, the equality case occurs in the above inequality if and only if $|A|^p$ and $|B^*|^q$ are
proportional, i.e., $|A|^p = \alpha|B^*|^q$ for some $\alpha \ge 0$.

Proof. The assertion is obvious if $A$ or $B$ is zero, and hence we assume that both of them are
non-zero. Let $A = U|A|$ and $B^* = V|B^*|$ be the polar decompositions with $U, V$ unitaries.
Then $AB = U|A|\,|B^*|V^*$, and hence $\|AB\|_1 = \|\,|A|\,|B^*|\,\|_1$. Let $\tilde{A} := |A|^p$, $\tilde{B} := |B^*|^q$ and
$\alpha := 1/p$. Then $\alpha > 1$ and $\mathrm{supp}\, \tilde{A} \le \mathrm{supp}\, \tilde{B}$ by assumption, and hence

$$\mathrm{Tr}\, |A|\,|B^*| = \mathrm{Tr}\, \tilde{A}^\alpha \tilde{B}^{1-\alpha} \ge (\mathrm{Tr}\, \tilde{A})^\alpha (\mathrm{Tr}\, \tilde{B})^{1-\alpha} = (\mathrm{Tr}\, |A|^p)^{1/p} (\mathrm{Tr}\, |B^*|^q)^{1/q} = \|A\|_p \|B\|_q,$$

where the inequality follows due to Corollary A.5. It is well-known that $|\mathrm{Tr}\, X| \le \|X\|_1$
for every $X \in \mathcal{B}(\mathcal{H})$; indeed, if $X = \sum_i s_i |f_i\rangle\langle e_i|$ is a singular-value decomposition then
$|\mathrm{Tr}\, X| = |\sum_i s_i \langle e_i, f_i\rangle| \le \sum_i s_i = \|X\|_1$. Hence, $\mathrm{Tr}\, |A|\,|B^*| \le \|\,|A|\,|B^*|\,\|_1 = \|AB\|_1$,
which completes the proof of the inequality (A.10). The characterization of the equality case
is immediate from Corollary A.5. □



A.7 Remark. Our interest in the inverse operator Hölder inequality was motivated by
The inequality was proved in [T7] for positive semidefinite operators, using the usual Hölder
inequality. An alternative direct proof for the general case and the condition for the equality
was obtained in [20], based on majorization theory [41 [T9].